FOREWORD


Geometrical optics is an outstanding example of the permanence 
of a basic subject. Though its history extends over many centuries 
there is today a ground swell of renewed interest in it, for which the 
probe into space, the laser, and computing machines are perhaps 
responsible. The large computers, in particular, have required 
critical examination of all the tried and tested procedures of analy- 
sis. Persons involved in laser cavity design, electron optics, crystal 
optics and focusing of nuclear particles and Cherenkov radiation, 
have all turned to the basic literature in geometrical optics. 

After hundreds of years of use and development, it is not sur- 
prising that the literature in the field is extensive. It is difficult 
for a newcomer to the field to know where to start his learning. 
The original works are usually excellent sources, but they are 
often difficult to obtain and to read because of their language and 
fragmented thought. Most readers today look for a modern treat- 
ment, and here lurk severe difficulties. 

Many of the modern writings suffer from a variety of basic 
defects. Thus, the authors’ interests are too mathematical, or else
too casual, and their work does not reveal the kind of feeling for 
the subject which actual practice demands. Then again several 
designers of optical instruments have attempted to describe the 
understanding they acquired through years of experience. These 
authors are often not trained in the discipline of mathematical 
writing. The result is that they derive equations clumsily, use 
poor notation, and usually over-step their understanding by in- 
correctly generalizing from a few special cases. 

This book now provides us with a modern account of the Hamil- 
tonian treatment of geometrical optics which overcomes the criti- 
cisms above. Dr Buchdahl has written in various fields of theo- 
retical physics in which he has proven his ability to straighten 
things out. His interest in geometrical optics is far from casual. 
He has spent several years writing on aberration theory, and the 
formulae derived include fifth, seventh and higher orders. His 
formulation has been coded into machine language and resides 
on the memory disks of computing machines throughout the world.

Persons with a wide variety of interests can now expand their
understanding of this fascinating field. Practising engineers—who 
have learned optics by doing optics—will find that their vague 
hunches and concepts can indeed often be formulated with pre- 
cision. Those who are primarily interested in the more formal 
aspects of the subject will find the book satisfying and instructive. 
The beginner may be confident that it will keep him on a straight 
path; and he will find that Buchdahl’s understanding runs so 
deep that his presentation flows with ease. 

It is my hope that this book will serve as an authoritative up- 
dating of optical aberration theory, and that future authors in this 
field will be guided by Buchdahl’s notation and organization of 
concepts. 

R. E. HOPKINS 
Rochester, April 1968 


PREFACE 


Although the propagation of light is described in all its details 
by Maxwell’s equations, many optical problems can be solved to 
a sufficient degree of approximation on the basis of the laws of 
geometrical optics. On this level a thorough understanding of the 
over-all features of the imagery produced by the types of systems 
commonly encountered in practice is most easily gained through 
Hamilton’s method. This presentation of it is partly intended to 
stimulate the teaching of the subject; in at least equal measure it is 
directed towards those whose task it is actually to design image- 
forming instruments. For this reason, if no other, the mathematical 
knowledge required of the reader has been kept to a minimum. 
For example, although the whole theory arises directly from a 
variational principle, Euler’s equations or formal variational deri- 
vatives are never considered explicitly. They are replaced through- 
out by elementary geometrical constructions. If some of the latter 
pages look a little bewildering at first glance, this is solely because 
they concern themselves with the kind of detail which is often 
quietly swept under the carpet; without which, however, one has 
merely a general framework, far removed as yet from what one 
needs to know in everyday practice... 

I have resisted the temptations of sophistication: with rare 
exceptions mathematical niceties have been given short shrift, 
whilst the many analogies which can be drawn between the 
methods of Hamiltonian optics and those arising in other branches 
of physics have been left aside. Pedagogically this is regrettable, 
bearing the present unpopularity of geometrical optics as a subject in 
mind. In this context the following remarks may not be out of place. 

The power of the Hamiltonian method lies in its ability to 
yield over-riding results governing the behaviour of types of 
systems defined solely with respect to the symmetries they possess. 
In terms of contemporary jargon, we have indeed a typical ‘black 
box theory’: the actual constitution of the optical system is left 
unspecified, that is, nothing is said as to whether it is made up of 
a finite number of refracting (or reflecting) surfaces, or whether 
the refractive index varies continuously, or partly both; nor is 
anything said as to the precise shape of any refracting or reflecting
surfaces which may be known to be present. In short, the only in- 
formation which is regarded as given is that the system as a whole 
has certain symmetries. Any single one of these is symmetry 
about either a line or about a plane; but the system may have 
several such symmetries simultaneously. Thus the kind of system 
conventionally called merely ‘symmetric’ has two, namely about 
a line, and a plane containing this line; whereas the semi-symmetric 
system has only the first of these. Now, to every symmetry of a 
system there corresponds an invariance property of its charac- 
teristic function; that is to say, there is a linear substitution of the 
ray-coordinates which leaves its form unaffected. Every symmetry 
therefore implies a restriction upon the generic form of the charac- 
teristic function, or more particularly, of its representation as 
a power series. Its coefficients are the (characteristic) aberration 
coefficients; and if we contemplate in the first place the set of 
nth-order coefficients of the most general system, any condition 
of symmetry subsequently imposed upon the latter implies rela- 
tions between these coefficients which may, in particular, take the 
form of necessarily assigning to some of them the value zero. 
At any rate, these relations impose powerful limitations upon the 
generic character of the aberrations produced by the system; and we 
may study them without being encumbered by the need to deter- 
mine how their values depend upon the details of its constitution. 
The situation just outlined is a counterpart to that which pre- 
vails in quantum mechanics. To any symmetry of a physical 
system there corresponds an invariance property (or, as one says, 
‘a symmetry’) of the Hamiltonian, which now takes the place 
of the characteristic function. The analogy can be pursued in 
various ways. For example, we might contemplate the wave- 
function of a general physical system (the number of degrees of 
freedom alone being prescribed), and suppose it to be expanded 
in terms of some basis: then any symmetry of the Hamiltonian will 
imply certain relations between the coefficients of the expansion. 
Suffice it to say that throughout theoretical physics a great deal of 
attention is focused upon the exploitation of known ‘symmetries’, 
especially in circumstances in which they constitute all the avail- 
able information; as may be the case in some aspects of scattering 
theory in which the region of interaction is then simply a ‘black 
box’. Granted that it is desirable to introduce students to such rela-
tively sophisticated notions as early as possible, it is evidently 
much easier to do so in the context of geometrical optics, since 
one is then not confronted with the largely irrelevant conceptual 
difficulties which quantum mechanics presents. 

Even when the pedagogic virtues of the theory which have 
just been discussed are set entirely aside, it is still desirable to deal 
in orderly progression with classes of systems defined by the 
various symmetries they possess; and that is just what is done in 
this monograph. Any redundancy which such a procedure entails 
is more than counterbalanced by the way in which repetition 
engenders easy familiarity. With the demands of practice upper- 
most in my mind I have tended to steer clear of ‘general’ theory— 
there is no discussion of the Maxwell fish eye, for example. Instead, 
I have elaborated in much detail all manner of problems relating 
to higher-order aberrations; whilst I have tried to clear up the 
doubts and misunderstandings which one encounters again and 
again in the context of the sine-relation and sine-condition, and 
of the so-called D−d method in the theory of chromatic aberrations,
to cite but two examples. Further, the detailed treatment
of questions related to reversibility, the sine- or cosine-relations, 
and the like, is not confined to the ‘symmetric system’ alone. 
Again, anyone who has ever encountered the (primary) Petzval 
curvature of field must have wondered whether, or in what sense, 
there exist higher-order aberrations with analogous properties. 
This question also is answered in much detail under the heading 
of ‘invariant and semi-invariant aberrations’, once the theory of 
shifts of the object and of the stop has been dealt with at length. 
At any rate, at this point the reader may do well to read through the 
list of contents, for the section headings will tell him concisely 
what he may expect to find. He may be surprised at the presence 
of a chapter on anisotropic media; but its inclusion is made 
necessary by the need to show that the explanations of certain 
earlier, seemingly paradoxical, results are indeed meaningful 
within the framework of the theory as a whole. 

At least by implication I have already expressed the view that 
the importance of Hamilton’s method derives mainly from its 
ability to yield, in the simplest possible way, general statements 
about what, in principle, a given type of system can or cannot do. 
That is not to say, however, that Hamiltonian optics deals solely
with such generalities. On the contrary, it is quite possible to 
compute the actual numerical values of the coefficients of the 
characteristic functions of a given system upon the basis of expres- 
sions derived from within the theory in hand. Chapter 12 indeed 
represents a brief introduction to this aspect of the subject. If 
I have any lingering doubts in this context, they concern merely 
certain purely practical considerations; namely, whether direct, 
that is to say Lagrangian, methods are not better suited to the 
purposes of numerical work. At any rate, this vexed question is 
discussed at length in Section 116. 

It has already been remarked that the mathematical level of this 
work has deliberately been left as low as possible. That the array 
of symbols and affixes of all kinds is rather formidable cannot be 
denied. It is sometimes claimed that the presence of a variety of 
superscripts, subscripts, asterisks, primes, and so on makes a 
theory difficult to read. It seems to me that this is true only from 
the point of view of superficial appearance. The more affixes there 
are the greater is the amount of explicit information contained 
within each (composite) symbol, and the interpretation of any given 
equation becomes correspondingly easier. As more and more 
affixes are suppressed, more and more sources of ambiguity arise, 
and ever greater feats of memory are required. At any rate, a 
list of the principal symbols used is included which may, in par- 
ticular, help anyone who is not in the habit of reading books 
systematically from end to end. 

Whilst on the subject of notation, it should be remarked that 
the sign conventions used are, of course, those of analytical 
geometry. The basic terminology, and the use of the symbols 
V, T and W for the point, angle, and mixed characteristics respec- 
tively naturally goes back to Hamilton. However, in one important 
point I have failed to follow Hamilton’s example, albeit very 
reluctantly: the values of quantities defined both in the object 
space and in the image space are represented by unprimed and 
primed symbols respectively. Hamilton adopted the opposite 
convention, which is much more sensible, since one is concerned 
most of the time with the details of the disposition of rays in the 
image space. Alas, the modern convention is so firmly entrenched 
that it seemed better not to run counter to it. 

Problems are provided at the end of each chapter. Since they
would be useless for any reader who cannot solve them even after 
prolonged attempts to do so, their solutions are given in reasonable 
detail at the end of the book. 

It is my earnest hope that no author will feel aggrieved if he 
finds no acknowledgement of his work here. The truth is that I 
may not know of it, or else that it has not contributed substantially 
to the material presented, for I have worked almost everything 
out from scratch. G. C. Steward alone I must mention explicitly, 
for the style and content of his writings have been a constant 
source of inspiration to me. Consistently with what has just been 
said, the bibliography is very circumscribed, and if it seems to 
lean towards the mathematical side, this is solely because I have 
found just the quoted works most useful in securing my own 
knowledge; and at the same time I nowhere make any claims to 
priority, though I believe certain parts of this work to be original. 

It is a pleasure to express my warmest thanks to the Institute of 
Optics in the University of Rochester, N.Y., and particularly to 
its Director, Dr W. L. Hyde, and to Professor R. E. Hopkins for 
the hospitality I enjoyed whilst I was a visitor there as New York 
State Professor of Optics during the academic year 1967-8. Funds 
to support this professorship were provided by the New York 
State Science and Technology Foundation under a grant from the 
state legislature. It was during that time that this monograph was 
written, and I presented the material of the first seven or eight 
chapters as a one-semester course to graduate students of the 
Institute. My thanks go to Mrs Dorothy McCarthy for the com- 
petent way in which she typed a demanding manuscript. I derived 
much benefit from the many discussions I had with Dr Peter 
Sands; and for these I am most grateful to him. To the Cambridge 
University Press I wish to express my warmest appreciation for 
their unfailing helpfulness and courtesy; and last, but not least, 
it is a pleasure to express my appreciation to Miss Jackie Flint 
for her remarkably careful and patient assistance in reading the 
proofs. 

H. A. BUCHDAHL 
Rochester, March 1968, and 
Canberra, May 1969 


CHAPTER I 


INTRODUCTION 


1. Introductory remarks 


Light is an electromagnetic disturbance characterized by its rapid 
oscillation in time and space. Its propagation through a transparent 
medium is therefore described in all its detail by Maxwell’s equations,
at any rate on the classical (i.e. non-quantal) level. However,
the wavelengths with which one is concerned are very small, being
of the order of $10^{-5}$ cm; for which reason a good approximation to
the laws of propagation may be obtained in many situations by disregarding
the finiteness of the wavelength altogether. In other
words, one contemplates the formal limit in which the wavelength $\lambda$
is allowed to tend to zero (see Section 115). All questions concerning
the phenomena of interference and diffraction (which are of course 
present in any actual physical situation) then remain unanswered. 
One often disregards also the polarization of light, as indeed we 
shall do here. One is left with a basically very simple theory in 
which the laws of propagation bear an essentially geometrical 
character; for which reason this theory is called Geometrical Optics.

At this stage it is convenient to introduce the restriction that, 
except in Chapter 11, all optical media to be considered are isotropic. 
A medium is isotropic if at every point its physical properties are 
independent of direction; in the contrary case it is anisotropic. 
(Actually, as will be explained in detail in Section 110, almost all 
of our work remains valid in the presence of internal anisotropy;
meaning that isotropy is required only in certain regions surround- 
ing the object and image.) Isotropy has the important consequence 
that the direction of energy flow is in the direction of the normal 
to the electromagnetic wavefront. One may therefore contemplate 
a set of curves, the rays, such that the tangent to a ray at any point 
gives the direction of energy transport at that point, i.e. the direction 
of propagation of light. This construction remains meaningful in 
the geometrical-optical limit. Accordingly two alternatives offer 
themselves: to base the theory either directly on the construction 
and properties of wavefronts, or else on the consideration of the
properties of families of rays. It is a matter of taste which of these 
alternatives one adopts. In any case the two methods are so closely 
related that the differences between them are little more than a 
matter of detail. Given a set, or ‘congruence’, of rays which origin- 
ated from a point source, these rays must be the orthogonal tra- 
jectories of a set of surfaces, and each such surface is just a wave- 
surface or wavefront. Accordingly, the wave-surfaces can be con- 
structed from the rays, and vice versa. Parenthetically, given any 
congruence of curves, there need by no means exist a family of 
surfaces cut orthogonally by the curves. If such a family does exist 
the congruence is normal, otherwise it is skew. For the present we 
assume that in every case through every point of the region of inter- 
est there passes one and only one curve. 

It may not be out of place here to remark that to some extent a 
ray can be realized physically by allowing light from a distant 
source to fall upon a screen provided with a circular aperture of 
sufficiently small radius r. The tube of light which emerges from 
the screen shrinks to a curve as $r \to 0$, and this curve is a ‘ray’. It is
of course essential to recall that diffraction effects are being disregarded,
since we are pretending that $\lambda = 0$. In actual practice one
can construct a ‘ray’ in this manner only approximately, since the
condition $r \gg \lambda$ has to be satisfied.

The stage has been reached to make a decision as to whether to 
place verbal emphasis on rays or on wave-surfaces; and we choose 
the former. The reason for this is as follows. Our primary interest 
centres around discovering the over-all properties of any given 
generic type of optical system, the general character of its imagery, 
and the way in which this depends upon the physical symmetries 
of the system. The most suitable tool for this purpose is probably 
Hamilton’s method, the foundations of which were laid by Sir 
William Rowan Hamilton in his Theory of Systems of Rays, published 
in the years 1828-37. The very title of his work implies an emphasis 
on rays. This naturally arises from the fact that the whole theory is 
based very directly on Fermat’s Principle, which is a statement 
about a general property of rays; as we shall see. 

The main virtue of Hamiltonian optics lies in its ability directly 
to yield general statements about the over-all properties of optical 
systems, without any need to inquire into the details of their con-
struction. Thus, taking the symmetric system as an example—to be 
studied in detail later on in Chapters 4, 5 and 6—it is irrelevant 
whether one has a finite or an infinite number of refracting or 
reflecting surfaces, and it is likewise irrelevant whether these 
surfaces are spherical or not. In short, all one needs to know is 
that the system is symmetric. Analogous conclusions hold whatever 
symmetries, if any, the system may possess. 


2. Fermat’s Principle 


Rather than introduce Fermat’s Principle immediately as an axiom 
we shall motivate it by reference to the elementary laws of refraction. 
To this end recall that in a homogeneous medium light rays are 
straight; and that any such medium is characterized by a certain 
number N, its refractive index. What part this plays we shall see in 
a moment. 

Fig. 1.1


Now suppose one has a pair of homogeneous media whose refractive
indices are N and N’ respectively, separated by a smooth
boundary $\mathcal{B}$. A light ray passes from a point A in the first medium
to a point A’ in the second, intersecting $\mathcal{B}$ in a point P, the point
of incidence. AP and PA’ must of course be straight, according to
what has already been said, but the angle between AP and PA’
will in general be different from zero. Now let $\mathbf{e}$, $\mathbf{e}'$ be unit vectors
in the directions AP, PA’ respectively, whilst $\mathbf{p}$ is the unit normal
to $\mathcal{B}$ at P. The laws of refraction can now be summed up in the single
equation

$$N'\mathbf{e}' - N\mathbf{e} = (N'\cos I' - N\cos I)\,\mathbf{p}, \qquad (2.1)$$

where $I$ and $I'$ (the ‘angles of incidence and refraction’) are the
angles between $\mathbf{e}$ and $\mathbf{p}$, and between $\mathbf{e}'$ and $\mathbf{p}$ respectively. We see
at once that the three vectors $\mathbf{e}$, $\mathbf{e}'$ and $\mathbf{p}$ are coplanar, whilst taking
the vector product of both members of (2.1) with $\mathbf{p}$ gives Snell’s law

$$N'\sin I' = N\sin I. \qquad (2.2)$$


We note in passing that if $\mathcal{B}$ is a reflecting surface, one can retain
(2.1) by adopting the formal device of setting N’ = —N. At any 
rate, one can now use (2.1) to trace a ray step by step through any 
system composed of homogeneous media and reflecting surfaces; 
but this is just the kind of thing with which we do not want to concern 
ourselves now. 
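
For readers who nonetheless wish to see (2.1) at work, the following is a minimal sketch of a single refraction step. It assumes that p is oriented from the first medium into the second, so that e.p > 0; the function name `refract` and the numerical values are illustrative and not part of the text.

```python
import math

def refract(e, p, n1, n2):
    """Return the unit vector e' of the refracted ray, solved from (2.1).

    e  : unit vector along the incident ray
    p  : unit normal to the boundary, oriented so that e.p > 0 (assumption)
    n1 : refractive index N of the first medium
    n2 : refractive index N' of the second medium
    """
    cos_i = sum(a * b for a, b in zip(e, p))          # cos I
    if cos_i <= 0:
        raise ValueError("p must point into the second medium (e.p > 0)")
    sin2_r = (n1 / n2) ** 2 * (1.0 - cos_i ** 2)      # sin^2 I' from (2.2)
    if sin2_r > 1.0:
        raise ValueError("total internal reflection: no refracted ray")
    cos_r = math.sqrt(1.0 - sin2_r)                   # cos I'
    k = n2 * cos_r - n1 * cos_i                       # coefficient of p in (2.1)
    return tuple((n1 * a + k * b) / n2 for a, b in zip(e, p))

# Example: a ray in air meeting glass at 30 degrees incidence.
e = (math.sin(math.radians(30.0)), 0.0, math.cos(math.radians(30.0)))
p = (0.0, 0.0, 1.0)
e_out = refract(e, p, 1.0, 1.5)
```

Since p lies along the z-axis here, the x-component of `e_out` is sin I’, and one may confirm directly that N’ sin I’ = N sin I.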

Given any curve $\mathcal{C}$ joining two points A and A’, the optical
length of $\mathcal{C}$ is defined to be the value of the integral

$$V^* = \int_A^{A'} N\,ds, \qquad (2.3)$$

where ds is the length of an infinitesimal element of $\mathcal{C}$, and N is
the refractive index of the medium at the midpoint of the element.
Under the conditions illustrated by Fig. 1.1 the optical length of the
curve APA’ is

$$V^*(APA') = N's' + Ns, \qquad (2.4)$$

where s, s’ are the distances between P and A, A’ respectively.
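
The definition (2.3) can be sketched numerically by replacing the curve with a polyline and evaluating N at the midpoint of each element, as the definition suggests. The function name `optical_length`, the plane boundary z = 0 and the numerical values below are assumptions made for illustration only.

```python
import math

def optical_length(points, index):
    """Approximate V* = integral of N ds along a polyline.

    points : list of (x, y, z) vertices of the curve, from A to A'
    index  : function (x, y, z) -> N, the refractive index
    Each straight element contributes N(midpoint) * length.
    """
    total = 0.0
    for a, b in zip(points, points[1:]):
        mid = tuple(0.5 * (u + v) for u, v in zip(a, b))
        total += index(*mid) * math.dist(a, b)
    return total

# Two homogeneous media separated by the plane z = 0 (assumed geometry).
N1, N2 = 1.0, 1.5
index = lambda x, y, z: N1 if z < 0 else N2

A, P, A2 = (0.0, 0.0, -3.0), (4.0, 0.0, 0.0), (4.0, 0.0, 2.0)
v_star = optical_length([A, P, A2], index)   # reduces to N's' + Ns of (2.4)
```

With s = 5 and s’ = 2 the sum N’s’ + Ns is 8, which is what the polyline estimate returns for this piecewise-constant index.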

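Granted that AP and PA’ are straight segments, the optical
length of the curves APA’ is defined no matter where P may be
located, provided it lies on $\mathcal{B}$. Having calculated V* for some arbitrary
position of P we may inquire into the change of V* consequent
upon P being slightly displaced. Thus

$$\delta V^* = N'\,\delta s' + N\,\delta s = N'\mathbf{e}'\cdot\delta\mathbf{s}' + N\mathbf{e}\cdot\delta\mathbf{s},$$

where $\mathbf{s}$ and $\mathbf{s}'$ denote the vectors from A to P and from P to A’ respectively.
Since $\mathbf{s}+\mathbf{s}'$ is obviously constant it follows that

$$\delta V^* = (N'\mathbf{e}' - N\mathbf{e})\cdot\delta\mathbf{s}'. \qquad (2.5)$$

However, recall that P is constrained to lie in $\mathcal{B}$, i.e. $\delta\mathbf{s}'$ must be
normal to $\mathbf{p}$. This condition can be accommodated by writing

$$\delta\mathbf{s}' = \mathbf{p}\times\delta\mathbf{a},$$

where $\delta\mathbf{a}$ is an arbitrary infinitesimal vector. (2.5) becomes

$$\delta V^* = [(N'\mathbf{e}' - N\mathbf{e})\times\mathbf{p}]\cdot\delta\mathbf{a}.$$

Accordingly $\delta V^* = 0$ for all allowed variations of P if and only if

$$(N'\mathbf{e}' - N\mathbf{e})\times\mathbf{p} = 0,$$

i.e.

$$N'\mathbf{e}' - N\mathbf{e} = \sigma\mathbf{p}, \qquad (2.6)$$

where $\sigma$ is some scalar factor. Scalar multiplication of (2.6) throughout
by $\mathbf{p}$ then gives (2.1) at once.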

We have just shown that the optical length V*(APA’) is stationary
for all small displacements of P in $\mathcal{B}$ if $\mathbf{e}$ and $\mathbf{e}'$ satisfy the laws of
refraction; and that, conversely, the stationary property $\delta V^* = 0$
implies that the laws of refraction are satisfied. In other words, the
condition that V* be stationary with respect to arbitrary small
variations of the position of the point P in $\mathcal{B}$ just singles out that
path joining A and A’ which is a light ray as it actually occurs in
nature. The value of V* calculated for this particular path will be
called the optical distance between A and A’, to be denoted by V.
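
The equivalence just established may be checked numerically. The sketch below treats a two-dimensional case with a plane boundary z = 0, locates the stationary position of P by a simple ternary search (a choice made here for brevity, not a method used in the text), and then evaluates N sin I − N’ sin I’. The geometry, the refractive indices and the function names are all illustrative assumptions.

```python
import math

N1, N2 = 1.0, 1.5          # refractive indices N and N' (illustrative values)
A = (0.0, 1.0)             # point A = (x, z) in the first medium, z > 0
A2 = (2.0, -1.0)           # point A' in the second medium, z < 0

def v_star(x):
    """Optical length of the broken line A -> P -> A' with P = (x, 0)."""
    s = math.hypot(x - A[0], A[1])
    s2 = math.hypot(A2[0] - x, A2[1])
    return N1 * s + N2 * s2

def stationary_point(lo, hi, steps=200):
    """Ternary search for the minimum of v_star, which is convex in x."""
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if v_star(m1) < v_star(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

x_p = stationary_point(A[0], A2[0])
sin_i = (x_p - A[0]) / math.hypot(x_p - A[0], A[1])        # sin I
sin_r = (A2[0] - x_p) / math.hypot(A2[0] - x_p, A2[1])     # sin I'
snell_residual = N1 * sin_i - N2 * sin_r                   # should be close to 0
```

The residual vanishes only to within the accuracy with which the minimum can be located in floating-point arithmetic, so it should be compared with a small tolerance rather than with zero.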

The crucial point of the preceding analysis of a special case is 
that it reveals the possibility of stating the laws governing rays in a 
form in which all explicit reference to planes of incidence or angles 
of incidence and refraction is avoided. We may expect that this 
possibility exists also under general circumstances, since one may 
deal with rays passing through a succession of homogeneous media 
in much the same way as when there are only two. Indeed, this 
generalization is almost trivial; and then one can think of an inhomogeneous
medium as stratified into an indefinitely large number of
homogeneous layers, the mutual boundary between any two of them 
being smooth. In short, the time has come to state Fermat’s Prin- 
ciple, which is, after all, nothing but a generalization of the results 
obtained in the special situation contemplated above. 

In terms of a Cartesian set of coordinates x, y, z let A(x, y, z)
and A’(x’, y’, z’) be any two points in an optical medium. (Note that
any optical system K is just an ‘optical medium’.) Let $\mathcal{C}$ be a curve
joining A and A’. Recall that the optical length of $\mathcal{C}$ is

$$V^*(A, A') = \int_A^{A'} N\,ds, \qquad (2.7)$$

where ds is an element of arc of $\mathcal{C}$, N is a prescribed function of
x, y, z and the integral is of course extended along $\mathcal{C}$. Then we have

Fermat’s Principle: The ray joining any two arbitrary points
A, A’ is determined by the condition that its optical length be 
stationary as compared with the optical length of arbitrary 
neighbouring curves joining A and A’. (2.8) 

Fermat’s Principle is a particular example of a variational prin- 
ciple with fixed end points, for one is comparing a property of 
different curves all of which are constrained to pass through a pair 
of fixed points. It should moreover be noted that all comparison 
curves must lie within a certain small neighbourhood of any one of 
them. This restriction can be motivated physically by the following 
example. Let A and A’ lie in a homogeneous medium, but let a
plane mirror be also present somewhere, so that one has two possible
rays, i.e. the straight line $\mathcal{R}$ joining A and A’ as well as the ray
$\mathcal{R}^*$ reflected at the mirror. Certainly the optical lengths of all
curves joining A and A’ and lying in a sufficiently small neighbourhood
of $\mathcal{R}$ are greater than that of $\mathcal{R}$, and likewise the optical lengths
of all curves joining A and A’ and lying in a sufficiently small neighbourhood
of $\mathcal{R}^*$ are greater than that of $\mathcal{R}^*$; yet the optical length of
$\mathcal{R}$ is less than that of $\mathcal{R}^*$. Therefore, were one to allow arbitrarily
distant comparison curves in (2.8) the ray $\mathcal{R}^*$ would never appear,
in contradiction with physical experience. In short, V is a relative
minimum for $\mathcal{R}^*$ and an absolute minimum for $\mathcal{R}$.

The actual value of any quantity when it is stationary with respect
to some specified class of variations is called an extremum. It will
be noticed that statement (2.8) of Fermat’s Principle merely implies
that the optical length of a ray is in fact an extremum of some kind,
but it avoids any reference to its specific character. In particular,
it does not require it to be a minimum. One easily sees that the
optical length of the path APA’ of Fig. 1.1 is in fact a minimum for
the ray, just as in the example concerning the rays $\mathcal{R}$ and $\mathcal{R}^*$ above
an absolute and a relative minimum were encountered. However, 
in general the situation is more complex. Take the example of a 
ray from a point A which is reflected at the point P of a concave 
mirror and then passes to a point A’, AP and PA’ being taken as 
straight, the medium being supposed homogeneous. It is not 
difficult to arrange the shape of the mirror to be such that V(APA’) 
is a maximum, granted that only variations of the position of the 
point of incidence P are contemplated. It is, however, obvious that 
one can find curves joining A, P and A’ of greater optical length:
one need only consider curves for which AP is no longer straight.
In other words, V is here a maximum for variations of P alone, but
a minimum for variations in which P is left fixed. The extremum 
is therefore of the kind of a saddle point. Of course, one can apply 
Fermat’s Principle to points lying between A and P or between P 
and A’ and so infer that AP and PA’ must be straight. This argu- 
ment, however, does not affect the conclusion that V is neither a 
minimum nor a maximum. Evidently one can never have a true 
maximum of V since any curve can always be so deformed as to 
increase its over-all optical length. To some extent the term optical 
‘distance’ is therefore conventional, since under everyday circum- 
stances the (geometrical) distance between two points is always 
the length of the shortest path connecting them. 
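
The saddle-point character of the extremum can be illustrated numerically. The sketch below borrows the two-dimensional geometry of Problem P.1 (iii), a parabolic mirror x = ky² with A and A’ placed symmetrically about the x-axis; the particular coordinates, the two values of k and the auxiliary point Q are assumptions chosen purely for illustration.

```python
import math

N = 1.0                               # homogeneous medium (illustrative value)
A, A2 = (1.0, 1.0), (1.0, -1.0)       # A and A', symmetric about the x-axis

def v_mirror(y, k):
    """Optical length of A -> P -> A' with P = (k*y**2, y) on the mirror."""
    p = (k * y * y, y)
    return N * (math.dist(A, p) + math.dist(p, A2))

def change_with_p(k, h=1e-3):
    """V* for a slightly displaced P minus V for the ray through the origin."""
    return v_mirror(h, k) - v_mirror(0.0, k)

flat = change_with_p(0.1)     # weakly curved mirror: displacing P lengthens the path
deep = change_with_p(1.0)     # strongly curved mirror: displacing P shortens the path

# Keep P fixed at the origin but bend the segment AP through a point Q off the line.
O = (0.0, 0.0)
Q = (0.5, 0.6)
detour = N * (math.dist(A, Q) + math.dist(Q, O) + math.dist(O, A2))
excess = detour - v_mirror(0.0, 1.0)   # positive: the bent curve is longer
```

For the strongly curved mirror the ray is thus a maximum with respect to displacements of P, yet a minimum with respect to deformations that leave P fixed, in keeping with the remarks above.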


Problems 


P.1(i). A two-parameter family of rays is given by the equations 
$x = t^2$, $y = at$, $z = bt$. Show that the wave-surfaces constitute a
family of spheroids. 


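P.1 (ii). A congruence of curves is specified by giving the direction
cosines $\alpha$, $\beta$, $\gamma$ of the tangent to a curve at the point x, y, z as functions
of x, y, z. Show that if the congruence is normal then

$$\alpha\left(\frac{\partial\gamma}{\partial y}-\frac{\partial\beta}{\partial z}\right)+\beta\left(\frac{\partial\alpha}{\partial z}-\frac{\partial\gamma}{\partial x}\right)+\gamma\left(\frac{\partial\beta}{\partial x}-\frac{\partial\alpha}{\partial y}\right)=0.$$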


P.1 (iii). (a) This problem is to be treated as two-dimensional.
Two points, A, A’, have the coordinates (a, b), (a, −b) respectively
(a > 0). A ray through A and A’ is reflected at the parabola $x = ky^2$,
the point of incidence P being of course at the origin. Show that
(with N = constant) the optical length of the path APA’ is a minimum
or a maximum for the ray according as k is less than or greater
than $\tfrac{1}{2}a/(a^2+b^2)$.

(b) Obtain the equation of the reflecting curve which is such
that V = constant, and show that your result is consistent with the
conclusion above.


P.1(iv). Show that in a medium with continuously varying 
refractive index equations (2.1) become d(Ne)/ds = grad N. 


CHAPTER 2 


CHARACTERISTIC FUNCTIONS 


3. The point characteristic V. The idea of the characteristic 
function 


Let K be some optical system, that is to say, an optical medium the
refractive index N of which is some prescribed function of the coordinates
x, y, z. As before A(x, y, z) and A’(x’, y’, z’) shall be two
arbitrary points in K. Then Fermat’s Principle selects, save in
exceptional circumstances, a particular curve $\mathcal{R}$ from amongst all
possible curves $\mathcal{C}$ joining A and A’, as we have seen. In other words,
Fermat’s Principle thus serves directly to associate an optical
distance V(A, A’) with A’ and A, i.e. a certain definite number which
depends solely upon the positions of these points. In short, given
A and A’, V(A, A’) is a definite function of the six variables x’, y’,
z’, x, y, z alone, and we write

$$V = V(x', y', z', x, y, z). \qquad (3.1)$$


The function V now defined is known as the point characteristic of K. 
The reason for this terminology will be made clear shortly. 

Our immediate task is to compare the optical distance between A
and A’ with that between neighbouring points $A_1$ and $A_1'$, the magnitudes
of the displacements $\delta\mathbf{s}$ (= $\delta x$, $\delta y$, $\delta z$) and $\delta\mathbf{s}'$ (= $\delta x'$, $\delta y'$, $\delta z'$)
leading from A to $A_1$, and from A’ to $A_1'$ being supposed sufficiently
small. $\mathcal{R}$ is the ray between A and A’, $\mathcal{R}_1$ that between $A_1$ and
$A_1'$. Speaking in terms of Fig. 2.1 for convenience, these rays are of
course independent of the values of the refractive index to the right
of A’ and $A_1'$ on the one hand, and to the left of A and $A_1$ on the
other. In other words, we are at liberty to pretend that the refractive
index is constant in these regions, its values on the left and right
being equal to those at A and A’ respectively, N and N’, say. If
B and B’ are two distant points so located that $\mathcal{R}$ is part of the ray
$\mathcal{S}$ joining them, then BA and A’B’ will both be straight. Finally,
let $\mathcal{S}_1$ be the curve consisting of $\mathcal{R}_1$ together with the two straight
lines $BA_1$ and $A_1'B'$.


Observe now that, by construction, 𝒮₁ will coalesce smoothly with 
𝒮 when δs and δs’ are allowed to tend to zero. 𝒮₁ is thus a curve 
neighbouring to 𝒮 of the kind contemplated in Fermat’s Principle. 
It follows that, since 𝒮 is a ray, the optical length of 𝒮₁ is (to the 
first order of small quantities) the same as that of 𝒮. One therefore 
has 

$$V(B, A) + V(A, A') + V(A', B') = V(B, A_1) + V(A_1, A_1') + V(A_1', B'). \tag{3.2}$$


Fig. 2.1 


Now, if e, e’ are unit vectors in the direction of ℛ at A and A’ 
respectively, 

$$V(B, A_1) - V(B, A) = N\,\mathbf{e}\cdot\delta\mathbf{s}, \qquad V(A', B') - V(A_1', B') = N'\,\mathbf{e}'\cdot\delta\mathbf{s}',$$

so that (3.2) becomes 

$$V(A_1, A_1') - V(A, A') = N'\,\mathbf{e}'\cdot\delta\mathbf{s}' - N\,\mathbf{e}\cdot\delta\mathbf{s}. \tag{3.3}$$

Both terms of the left-hand member of (3.3) are optical distances, 
so that, recalling (3.1), one has just the difference 

$$V(x' + \delta x', y' + \delta y', z' + \delta z', x + \delta x, y + \delta y, z + \delta z) - V(x', y', z', x, y, z).$$
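The components of the vectors e and e’ are respectively the direction 
cosines α, β, γ and α’, β’, γ’ of the tangents to the ray at the points 
in question. Written out in full, (3.3) becomes 

$$\delta V = \frac{\partial V}{\partial x'}\,\delta x' + \frac{\partial V}{\partial y'}\,\delta y' + \frac{\partial V}{\partial z'}\,\delta z' + \frac{\partial V}{\partial x}\,\delta x + \frac{\partial V}{\partial y}\,\delta y + \frac{\partial V}{\partial z}\,\delta z = N'(\alpha'\,\delta x' + \beta'\,\delta y' + \gamma'\,\delta z') - N(\alpha\,\delta x + \beta\,\delta y + \gamma\,\delta z). \tag{3.4}$$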


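Since this must hold identically for all values of the six coordinate 
differentials δx’, ..., we have the basic equations of Hamiltonian 
optics: 

$$\begin{aligned} N'\alpha' &= \frac{\partial V}{\partial x'}, & N'\beta' &= \frac{\partial V}{\partial y'}, & N'\gamma' &= \frac{\partial V}{\partial z'}, \\ N\alpha &= -\frac{\partial V}{\partial x}, & N\beta &= -\frac{\partial V}{\partial y}, & N\gamma &= -\frac{\partial V}{\partial z}. \end{aligned} \tag{3.5}$$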


Hitherto x, y, z and x’, y’, z’ have been the coordinates of two points, 
referred to the same set of Cartesian axes x̄, ȳ, z̄. However, (3.5) 
remains valid if x, y, z are referred to one set of axes x̄, ȳ, z̄, whilst 
x’, y’, z’ are referred to a different set of Cartesian axes, x̄’, ȳ’, z̄’. 
The direction cosines of course then relate in each case to the 
appropriate axes. 

If K is some optical system, one conventionally speaks of the 
region in which the object is situated as the object space, and un- 
primed symbols will always refer to this. Primed symbols on the 
other hand will always refer to the image space, i.e. the region in 
which one inquires into the disposition of rays from the object 
after their passage through K. We shall suppose throughout that 
the refractive index is constant in the object space and in the image 
space. This restriction is by no means essential, but it is very con- 
venient, and is appropriate to the majority of cases encountered in 
practice. It has the advantage that the initial ray and the final ray 
(i.e. the parts of any given ray ℛ which lie in the object space and 
image space respectively) are straight lines. 

Now suppose that the form of the function V, appropriate to the 
system K, is known. Then, given that some initial ray passes 
through the point A(x, y, z), and the final ray through the point 
A’(x’, y’, z’), the equations (3.5) immediately provide the directions 
of ℛ at A and A’. One therefore knows exactly all the data x’, y’, z’, 
β’, γ’ of the final ray which correspond to the data x, y, z, β, γ 
of the initial ray. In other words, given only the form of the function 
V, the particular correspondence established by K between initial 
and final rays is available. This correspondence completely characterizes 
the geometrical-optical behaviour of K; so that it is natural 
to refer to V as a characteristic function. The possibility of, as it 
were, summing up the properties of any given system in a single 
function has of course very great advantages in optical theory. 
Even the very existence of a characteristic function, coupled only 
with assumptions of a general kind (such as assumptions about 
differentiability) leads to interesting consequences. 

As we have seen, V has a simple physical meaning: it is the optical 
distance between pairs of points, expressed as a function of the 
coordinates of these points. For this reason it is often called the 
point characteristic, a terminology which at the same time serves to 
distinguish it from the alternative characteristic functions yet to 
be considered. 

When investigating the properties of optical systems by using the 
point characteristic one must always bear in mind that (3.5) cannot 
be applied in situations in which there exist several rays connecting 
the points A’(x’, y’, z’) and A(x, y, z). For example, if these two 
points are the foci of an ellipsoid of revolution, all rays from the 
first pass through the second after reflection at the ellipsoid, granted 
that the interior medium is homogeneous. The optical distance 
between the foci (calculated for reflected rays) is in fact constant; 
and it is quite obvious that in this situation the equations (3.5) 
become meaningless. Quite generally, the use of the point charac- 
teristic is precluded if there exists a set of rays the members of which 
pass through both A and A’. 
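
The ellipsoid example can be checked numerically. The following is only a sketch under stated assumptions: it works in a two-dimensional cross-section (an ellipse), the semi-axes `a`, `b` and the index `n` are illustrative values chosen here, and it relies on the familiar focal property of the ellipse. It shows that the reflected optical distance between the foci is the same for every point of incidence, so that its derivatives carry no information about the ray directions.

```python
import math

def reflected_optical_distance(t, a=2.0, b=1.0, n=1.0):
    """Optical length of the path focus A -> mirror point P(t) -> focus A'."""
    c = math.sqrt(a * a - b * b)                 # distance of each focus from the centre
    px, py = a * math.cos(t), b * math.sin(t)    # point of incidence on the ellipse
    d1 = math.hypot(px + c, py)                  # from A(-c, 0) to P
    d2 = math.hypot(px - c, py)                  # from P to A'(+c, 0)
    return n * (d1 + d2)

# Sample points of incidence all round the ellipse
lengths = [reflected_optical_distance(0.1 * k) for k in range(63)]
spread = max(lengths) - min(lengths)
print("largest variation of the optical distance:", spread)
```

Since every sampled path has the same optical length (2na up to rounding), V(A, A’) does not single out one ray, which is the circumstance excluded above.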

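Inspection of (3.5) reveals that V must satisfy both of the 
differential equations 

$$\begin{aligned} \left(\frac{\partial V}{\partial x'}\right)^2 + \left(\frac{\partial V}{\partial y'}\right)^2 + \left(\frac{\partial V}{\partial z'}\right)^2 &= N'^2, \\ \left(\frac{\partial V}{\partial x}\right)^2 + \left(\frac{\partial V}{\partial y}\right)^2 + \left(\frac{\partial V}{\partial z}\right)^2 &= N^2. \end{aligned} \tag{3.6}$$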


In particular, these are satisfied, in the case of a homogeneous 
medium of refractive index N, by 

$$V = N[(x' - x)^2 + (y' - y)^2 + (z' - z)^2]^{1/2}. \tag{3.7}$$
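
As an illustration, the statements (3.5) and (3.6) can be verified numerically for the homogeneous medium (3.7). This is a minimal sketch, not part of the original argument: the value of `N`, the two points and the finite-difference step are arbitrary choices made here.

```python
import math

N = 1.5  # refractive index of the homogeneous medium (illustrative value)

def V(xp, yp, zp, x, y, z):
    """Point characteristic (3.7): N times the distance between A and A'."""
    return N * math.sqrt((xp - x) ** 2 + (yp - y) ** 2 + (zp - z) ** 2)

def partials(f, args, h=1e-6):
    """Central-difference partial derivatives of f with respect to each argument."""
    out = []
    for i in range(len(args)):
        up, dn = list(args), list(args)
        up[i] += h
        dn[i] -= h
        out.append((f(*up) - f(*dn)) / (2 * h))
    return out

A_img = (4.0, 1.0, -2.0)    # A'(x', y', z')
A_obj = (0.5, -1.0, 0.3)    # A(x, y, z)
dV = partials(V, A_img + A_obj)

dist = math.dist(A_img, A_obj)
e = [(p - q) / dist for p, q in zip(A_img, A_obj)]   # unit vector from A to A'

ray_img = dV[:3]                 # should equal N' e' by (3.5)
ray_obj = [-d for d in dV[3:]]   # should equal N e  by (3.5)
norm_sq = sum(c * c for c in ray_img)   # should equal N'^2 by (3.6)
print(ray_img, ray_obj, norm_sq)
```

In a homogeneous medium the ray is the straight line from A to A’, so both gradients reproduce N times the same unit vector, and the sum of squares returns N².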
It is virtually impossible to obtain the exact form of the point 
characteristic in all but effectively trivial cases, and in practice one 
has to be satisfied with approximations of one kind or another. This, 
however, is fortunately of little concern to us here since we are 
mainly interested in general properties of systems, and these can be 
discussed without knowing the explicit form of the characteristic 
function. The following analogy may be informative for the reader 
who has some acquaintance with quantum mechanics. Thus, a 
great deal of information can be gained on the basis of known 
symmetries of the Hamiltonian H of an atomic system, even 
though the explicit form of H may be unknown. This situation is 
similar to that which exists in Hamiltonian optics, as we shall see 
in due course. 


4. The angle characteristic T 

In theoretical investigations the use of the point characteristic 
occasionally leads to irrelevant difficulties, brought about, for ex- 
ample, by the need to observe the restriction to which attention 
was drawn just prior to equations (3.6). When such a situation 
arises the use of some alternative characteristic function may be 
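appropriate. Accordingly, recall that by (3.4) the total 
differential of the point characteristic is 

$$dV = N'(\alpha'\,dx' + \beta'\,dy' + \gamma'\,dz') - N(\alpha\,dx + \beta\,dy + \gamma\,dz). \tag{4.1}$$

Let 

$$U = N(\alpha x + \beta y + \gamma z), \qquad U' = N'(\alpha' x' + \beta' y' + \gamma' z'), \tag{4.2}$$

and recall that N and N’ are both constant. Then, using (4.1), and 
eliminating dα and dα’ by means of α² + β² + γ² = 1 and its primed 
counterpart, 

$$\begin{aligned} d(V - U' + U) &= -N'(x'\,d\alpha' + y'\,d\beta' + z'\,d\gamma') + N(x\,d\alpha + y\,d\beta + z\,d\gamma) \\ &= -N'[(y' - \beta' x'/\alpha')\,d\beta' + (z' - \gamma' x'/\alpha')\,d\gamma'] + N[(y - \beta x/\alpha)\,d\beta + (z - \gamma x/\alpha)\,d\gamma]. \end{aligned} \tag{4.3}$$

Let the function whose total differential appears on the left be 
expressed in terms of the independent variables β’, γ’, β, γ, and 
denote it by the symbol T. Then, by inspection of (4.3), 

$$\begin{aligned} N'(y' - \beta' x'/\alpha') &= -\frac{\partial T}{\partial \beta'}, & N'(z' - \gamma' x'/\alpha') &= -\frac{\partial T}{\partial \gamma'}, \\ N(y - \beta x/\alpha) &= \frac{\partial T}{\partial \beta}, & N(z - \gamma x/\alpha) &= \frac{\partial T}{\partial \gamma}. \end{aligned} \tag{4.4}$$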
These equations are precisely analogous to equations (3.5). In 
fact, suppose the form of the function T to be known. Then, if one 
direction in the object space and one direction in the image space 
be prescribed, equations (4.4) immediately provide the equations 
of the initial and final parts of that ray which has these directions. 
T evidently completely characterizes the geometrical-optical 
properties of K, so that it is also a characteristic function. In view of 
the fact that it must be given as a function of direction cosines, it is 
known as the angle characteristic. 

Here again one has to bear one exceptional situation in mind. 
It arises when a set of mutually parallel initial rays is transformed by 
K into a set of mutually parallel final rays; for obviously the mere 
specification of the directions of these cannot lead to the selection 
of unique initial and final rays. The kind of system now contemplated 
is called afocal or telescopic; so that if we are confronted with 
such a system the use of the angle characteristic is precluded. 

Fig. 2.2 

The angle characteristic has a simple physical interpretation. 
Thus, let the normal to the initial ray drawn from the origin of the 
axes in the object space meet the ray in the point Q; a point Q’ being 
defined analogously in the image space. Then U/N is the distance 
between Q and A, and likewise U’/N’ is the distance between Q’ and 
A’. Hence T is the optical distance between Q and Q’. It should be 
remarked that when N and N’ are not constant one easily convinces 
oneself that T has to be defined as a function of the components of 
the ‘ray vectors’ Ne, N’e’ rather than those of e, e’; and the geo- 
metrical interpretation of T given above no longer obtains. We 
shall not, however, concern ourselves with these complications. 
The angle characteristic can be obtained in explicit, closed form 
for certain simple systems. By way of example, consider a system 
consisting of a single spherical refracting surface 𝒮, i.e. two homogeneous 
media, the boundary between them being of constant curvature 
1/r. The geometrical situation is shown in Fig. 2.3. The 
origins of axes B, B’ lie on a diameter of the spherical surface of 
which 𝒮 is a part, at distances q and q’ respectively to the right 
of the pole A₀ of 𝒮. C is the centre of curvature, so that the line PC, 
of length r, is normal to 𝒮. Then one can show that 

$$T = (N' - N)\,r\,\{1 - 2\kappa[\eta + (1 - \xi)^{1/2}(1 - \zeta)^{1/2} - 1]\}^{1/2} - N'(r - q')(1 - \xi)^{1/2} + N(r - q)(1 - \zeta)^{1/2}, \tag{4.5}$$

where 

$$\xi = \beta'^2 + \gamma'^2, \qquad \eta = \beta\beta' + \gamma\gamma', \qquad \zeta = \beta^2 + \gamma^2, \qquad \kappa = NN'/(N' - N)^2. \tag{4.6}$$

Fig. 2.3 

Most situations encountered in practice are of course a good deal 
more complex, and one has again to be content with approxima- 
tions. 
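
The closed form (4.5) lends itself to a numerical cross-check against (4.4). The sketch below is an illustration only: the numerical values of N, N’, r, q, q’ and of the four direction cosines are arbitrary choices made here, r is assumed positive (centre of curvature to the right of the pole), and the point of incidence P is obtained from the vector form of the law of refraction (N’e’ − Ne parallel to the normal), which is not spelled out in the text at this stage.

```python
import math

N_OBJ, N_IMG = 1.0, 1.5      # N and N' (illustrative values)
R = 10.0                     # radius of curvature r > 0
Q_OBJ, Q_IMG = 2.0, 3.0      # q and q': origins B, B' to the right of the pole

def angle_characteristic(bp, gp, b, g):
    """T(beta', gamma', beta, gamma) according to (4.5) and (4.6)."""
    xi = bp * bp + gp * gp
    eta = b * bp + g * gp
    zeta = b * b + g * g
    kappa = N_OBJ * N_IMG / (N_IMG - N_OBJ) ** 2
    root = math.sqrt(1 - 2 * kappa * (eta + math.sqrt((1 - xi) * (1 - zeta)) - 1))
    return ((N_IMG - N_OBJ) * R * root
            - N_IMG * (R - Q_IMG) * math.sqrt(1 - xi)
            + N_OBJ * (R - Q_OBJ) * math.sqrt(1 - zeta))

def partial(f, args, i, h=1e-6):
    """Central-difference derivative of f with respect to argument number i."""
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)

def point_of_incidence(bp, gp, b, g):
    """Point P on the sphere, relative to the pole, where e is refracted into e'."""
    a = math.sqrt(1 - b * b - g * g)
    ap = math.sqrt(1 - bp * bp - gp * gp)
    # N'e' - Ne is parallel to the surface normal at P
    m = (N_IMG * ap - N_OBJ * a, N_IMG * bp - N_OBJ * b, N_IMG * gp - N_OBJ * g)
    norm = math.sqrt(sum(c * c for c in m))
    sign = 1.0 if N_IMG > N_OBJ else -1.0
    # P = C - r n, with n the unit normal pointing from P towards C = (r, 0, 0)
    return (R - sign * R * m[0] / norm, -sign * R * m[1] / norm, -sign * R * m[2] / norm)

dirs = (0.05, -0.02, 0.10, 0.03)     # beta', gamma', beta, gamma
bp, gp, b, g = dirs
a = math.sqrt(1 - b * b - g * g)
ap = math.sqrt(1 - bp * bp - gp * gp)
px, py, pz = point_of_incidence(*dirs)

# Left-hand members of (4.4), evaluated geometrically at P (x measured from B or B')
y_obj_geom = N_OBJ * (py - b * (px - Q_OBJ) / a)
y_img_geom = N_IMG * (py - bp * (px - Q_IMG) / ap)

# Right-hand members of (4.4), from numerical derivatives of T
y_obj_from_T = partial(angle_characteristic, dirs, 2)
y_img_from_T = -partial(angle_characteristic, dirs, 0)
print(y_obj_geom, y_obj_from_T, y_img_geom, y_img_from_T)
```

The geometrical and the differentiated values agree to within the accuracy of the finite differences, which is what (4.4) asserts for any point on the initial and final rays.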


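5. The mixed characteristics W₁ and W₂ 
At times it proves convenient to employ one or other of a pair of 
characteristic functions which are, as it were, ‘intermediate’ 
between V and T. Thus, in the notation of Section 4, let 

$$W_1 = V + U.$$

Then 

$$dW_1 = N'(\alpha'\,dx' + \beta'\,dy' + \gamma'\,dz') + N[(y - \beta x/\alpha)\,d\beta + (z - \gamma x/\alpha)\,d\gamma].$$

Thus, if W₁ be regarded as a function of the five independent 
variables x’, y’, z’, β, γ, one has 

$$\begin{aligned} N'\alpha' &= \frac{\partial W_1}{\partial x'}, \qquad N'\beta' = \frac{\partial W_1}{\partial y'}, \qquad N'\gamma' = \frac{\partial W_1}{\partial z'}, \\ N(y - \beta x/\alpha) &= \frac{\partial W_1}{\partial \beta}, \qquad N(z - \gamma x/\alpha) = \frac{\partial W_1}{\partial \gamma}. \end{aligned} \tag{5.1}$$

Clearly W₁ is a characteristic function, and we shall call it the first 
mixed characteristic. It must evidently satisfy the one differential 
equation 

$$\left(\frac{\partial W_1}{\partial x'}\right)^2 + \left(\frac{\partial W_1}{\partial y'}\right)^2 + \left(\frac{\partial W_1}{\partial z'}\right)^2 = N'^2.$$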


Geometrically it is the optical distance between Q and A’. Here 
again we have one situation in which the use of this particular 
characteristic function is precluded, namely when there exists a 
set of rays through A’ the members of which are mutually parallel 
in the object space. 

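Without further ado we now define the second mixed characteristic 
W₂ to be the optical distance between A and Q’, 

$$W_2 = V - U',$$

regarded as a function of the five variables β’, γ’, x, y, z. One has 

$$\begin{aligned} N'(y' - \beta' x'/\alpha') &= -\frac{\partial W_2}{\partial \beta'}, \qquad N'(z' - \gamma' x'/\alpha') = -\frac{\partial W_2}{\partial \gamma'}, \\ N\alpha &= -\frac{\partial W_2}{\partial x}, \qquad N\beta = -\frac{\partial W_2}{\partial y}, \qquad N\gamma = -\frac{\partial W_2}{\partial z}. \end{aligned} \tag{5.2}$$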


In the context of W, one has to exclude the situation in which 
there exists a set of rays through A the members of which are 
mutually parallel in the image space. 

Twelve further mixed characteristics may be defined after the 
fashion above (see towards the end of Section 6). These rarely occur 
in practice, though we shall have occasion to use one of them in 
Section 92. In conclusion, attention must be drawn to the fact 
that, until we come to Chapter 10, all rays through a given system 
are taken to have a definite, single wavelength. 
6. Reduced distances. Developments of notation. Further 
characteristic functions 


The ubiquitous appearance of the constants N and N’ in the various 
equations just considered suggests their removal by some simple 
convention. This may be achieved by the introduction of reduced 
distances. It amounts to this: that if A, B are two points in the 
object space and s is the distance between them, then henceforth 
we agree to use the same symbol s for the reduced distance between 
A and B, this being defined as N times the (geometrical) distance 
between them. An analogous convention is to be observed in 
the image space, N’ of course replacing N. The refractive in- 
dices in equations (3.5) now no longer appear explicitly, since 
x’ replaces N’x’, and so on for all other variables which have 
the character of distances. Similarly N and N’ will be absent 
from equations (4.4), (5.1) and (5.2), and throughout our work 
N and N’ will hardly ever appear explicitly again; see, however, 
Section 102. 

Next, it may have been noticed that V is a function of six variables, 
W₁ and W₂ functions of five variables, whilst T is a function of only 
four variables. However, the variables x and x’ are in effect redundant 
in the sense that in any particular situation their values may be 
taken as fixed. To see this, notice first that in the context of the 
point characteristic we have hitherto specified all the coordinates 
of two points A and A’ to select a particular ray. However, choose 
now in the object space a fixed plane ℬ, called the anterior base-plane, 
and likewise choose in the image space a fixed plane ℬ’ called 
the posterior base-plane. As a matter of convenience let the origins 
and orientations of the coordinate axes be chosen in such a way that 
the equations of the base-planes are x̄ = 0 and x̄’ = 0 respectively. 
The point characteristic V is now explicitly a function of only the 
four variables y’, z’, y, z, i.e. we consider only the optical distances 
between points A(y, z) of the first base-plane and points A’(y’, z’) 
of the second. Nevertheless, by giving a set of values of these variables 
a particular ℛ will have been selected. The situation is simply 
that whereas originally the dependence of V on x and x’ was known 
explicitly, it is now contained only implicitly in the form of the 
function V(y’, z’, y, z). This is good enough for most purposes, 
and in any event the ‘complete’ function V(x’, y’, z’, x, y, z) can 
be recovered if necessary (cf. Section 35). 

As regards W₁ and W₂ their explicit dependence on x’ and x 
respectively will no longer be given. Finally, with our choice of 
coordinate axes, wherever the variables x’ and x appear in the 
left-hand members of the equations (4.4), (5.1) and (5.2) they are to be 
replaced by zero. In short, as a result of the various conventions which 
have been introduced we have the four sets of equations 


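$$\beta' = \frac{\partial V}{\partial y'}, \qquad \gamma' = \frac{\partial V}{\partial z'}, \qquad \beta = -\frac{\partial V}{\partial y}, \qquad \gamma = -\frac{\partial V}{\partial z}, \tag{6.1}$$

$$y' = -\frac{\partial T}{\partial \beta'}, \qquad z' = -\frac{\partial T}{\partial \gamma'}, \qquad y = \frac{\partial T}{\partial \beta}, \qquad z = \frac{\partial T}{\partial \gamma}, \tag{6.2}$$

$$\beta' = \frac{\partial W_1}{\partial y'}, \qquad \gamma' = \frac{\partial W_1}{\partial z'}, \qquad y = \frac{\partial W_1}{\partial \beta}, \qquad z = \frac{\partial W_1}{\partial \gamma}, \tag{6.3}$$

$$y' = -\frac{\partial W_2}{\partial \beta'}, \qquad z' = -\frac{\partial W_2}{\partial \gamma'}, \qquad \beta = -\frac{\partial W_2}{\partial y}, \qquad \gamma = -\frac{\partial W_2}{\partial z}. \tag{6.4}$$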


At this point we may conveniently insert a remark concerning 
the further characteristic functions referred to at the end of Section 
5. Recall that we arrived at W₁, for example, by contemplating the 
total differential of the function V + U (= V + βy + γz). There is, 
however, nothing to prevent us from defining, say, the function 
W₃(y’, z’, y, γ) = V + γz. Then 

$$\beta' = \frac{\partial W_3}{\partial y'}, \qquad \gamma' = \frac{\partial W_3}{\partial z'}, \qquad \beta = -\frac{\partial W_3}{\partial y}, \qquad z = \frac{\partial W_3}{\partial \gamma},$$

and W₃ is a characteristic function. One has altogether sixteen 
possibilities: any particular choice corresponds to regarding as 
independent variables any two taken from amongst y’, z’, β’, γ’ 
together with any two taken from amongst y, z, β, γ, exactly one 
variable being taken from each of the pairs (y’, β’), (z’, γ’), (y, β), 
(z, γ). We note 
parenthetically that the list of formal possibilities is still not 
exhausted; but the discussion of these will be deferred to Sections 
36 and 37. 

Evidently, if F stands generically for any one of the alternative 
characteristic functions introduced hitherto, each of them depends 
upon an appropriate set of four independent variables, to 
be called ray-coordinates, and these may be uniformly denoted by 
qᵢ (i = 1, ..., 4), 
primed coordinates always preceding the unprimed. The derivatives 
of F then yield in each case four variables which may be 
denoted by pᵢ (i = 1, ..., 4); so that 

$$p_i = \frac{\partial F}{\partial q_i} \qquad (i = 1, \ldots, 4). \tag{6.5}$$


At times pᵢ will be called the conjugate of qᵢ. Note that, for example, 
the conjugate of β is y, whereas the conjugate of y is −β. Since now 

$$dF = \sum_{i=1}^{4} p_i\,dq_i, \tag{6.6}$$

one has the six conditions of integrability 

$$\frac{\partial p_i}{\partial q_j} - \frac{\partial p_j}{\partial q_i} = 0 \qquad (i, j = 1, \ldots, 4) \tag{6.7}$$

on the linear differential form on the right of (6.6). 
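
To make the conditions of integrability concrete, take F to be the point characteristic V, so that the ray-coordinates are (q₁, q₂, q₃, q₄) = (y’, z’, y, z) and, by the reduced form of (3.5), their conjugates are (p₁, p₂, p₃, p₄) = (β’, γ’, −β, −γ). Written out under that choice, purely as an illustration, the six conditions read:

```latex
\begin{aligned}
\frac{\partial \beta'}{\partial z'} &= \frac{\partial \gamma'}{\partial y'}, &
\frac{\partial \beta}{\partial z} &= \frac{\partial \gamma}{\partial y}, \\
\frac{\partial \beta'}{\partial y} &= -\frac{\partial \beta}{\partial y'}, &
\frac{\partial \beta'}{\partial z} &= -\frac{\partial \gamma}{\partial y'}, \\
\frac{\partial \gamma'}{\partial y} &= -\frac{\partial \beta}{\partial z'}, &
\frac{\partial \gamma'}{\partial z} &= -\frac{\partial \gamma}{\partial z'}.
\end{aligned}
```

Each relation merely expresses the equality of a pair of mixed second derivatives of V.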


7. The idea of the aberration function 


In the most general terms, the design of an optical system K which 
is intended for some specific purpose is so arranged that K will 
produce, as nearly as possible, an image of a certain desired charac- 
ter of the kind of object being contemplated. The degree of success 
achieved in this endeavour depends of course upon limitations 
of both a theoretical and a practical nature. At any rate, one is 
usually (though not always) confronted with the problem of giving 
K a structure such that all the rays from any one point O of a set of 
points in the object space will pass through a corresponding point 
O’ in the image space, the coordinates of O’ depending in a pre- 
assigned way on those of O. By way of example, a very common 
situation is this: one desires K to produce a sharp image of a plane 
object and, moreover, this image is to be geometrically similar to 
the object. As already remarked, such ‘perfect’ imagery cannot in 
general be achieved for one reason or another. In other words, 
if ℐ’ is the plane in which the image is intended to be formed, 
rays from any point O of the object will, in general, fail to 
pass through the appropriate point O’ of ℐ’. In that case one says 
that ‘the imagery of K is imperfect’, or that ‘K is afflicted with 
aberrations’. If any particular ray through O in fact intersects ℐ’ 
in O₁’, the displacement ε’ from O’ to O₁’ is a measure of the aberration 
of the ray, and ε’, regarded as a function of the variables specifying 
particular rays, characterizes the aberrations of K. Of course, the 
desired imagery of K may be of a different kind, e.g. one might want 
K to be such that it produces a rectangular image of a square object 
(anamorphotic systems), or one might wish a plane object to have 
a sharp image lying on some surface other than a plane. 

Whatever the actual situation may be in detail, we shall take it for 
granted that one can specify an ideal characteristic function F₀, such 
that if K had this characteristic function then its imagery would 
be precisely that which one wants it to be. The actual characteristic 
F will in general not be F₀, as discussed above, and we write 

$$F = F_0 + f. \tag{7.1}$$

Then f is called the aberration function; and all aberrations of K will 
be absent if f = 0. (See also the end of Section 36.) 

It may be useful to present a specific example illustrating the 
preceding remarks. Let it be desired that a system K produce a 
sharp, undistorted, plane image in ℐ’ of a plane object in ℐ. The 
anterior base-plane ℬ will be taken to coincide with ℐ, whereas 
the posterior base-plane ℬ’ is chosen to be parallel to ℐ’, the equation 
of the latter therefore being x̄’ = d’, where d’ is some constant. 
A ray through the point O(0, y, z) of ℐ intersects ℐ’ in the point 
O’(d’, Y’, Z’). The geometrical similarity of object and image 
implies that there is a constant m such that 

$$Y' = my, \qquad Z' = mz \tag{7.2}$$

for all rays through O, the orientation of the axes in the image space 
having been suitably chosen. (The coordinate-system in the image space may 
possibly have had to be taken as left-handed.) The constant m is 
the reduced magnification associated with ℐ and ℐ’. The actual 
magnification refers to the ratio of corresponding unreduced 
coordinates, and so has the value Nm/N’. 
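
The value Nm/N’ follows in one line from the convention of Section 6. Writing y_g and Y’_g for the unreduced (geometrical) coordinates, the subscript g being a label introduced here for illustration only, the reduced coordinates are y = N y_g and Y’ = N’ Y’_g, so that the first of (7.2) gives:

```latex
N'\,Y'_g = m\,N\,y_g
\quad\Longrightarrow\quad
\frac{Y'_g}{y_g} = \frac{N m}{N'} .
```

The same ratio is obtained from the second of (7.2).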

Now the optical distance from O to O’ is the same for all rays 
through O and so depends on y and z alone: 

$$V(O, O') = g(y, z). \tag{7.3}$$

On the other hand, if a ray through O and O’ intersects ℬ’ in the 
point D’(y’, z’), then 

$$V(O, D') = V(O, O') - V(D', O'),$$

whence 

$$V_0(y', z', y, z) = g(y, z) - [d'^2 + (y' - my)^2 + (z' - mz)^2]^{1/2}, \tag{7.4}$$

where (7.2) has been used. Thus, if K has a point characteristic of 
the generic form (7.4), where g is any function of y and z, its imagery 
will have the desired properties. Whether or not a system of some 
generic type (to which one might be restricted in practice) can actually 
have a point characteristic of the form (7.4) is irrelevant. 
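
That a point characteristic of the form (7.4) does produce the imagery (7.2) can be confirmed numerically. The sketch below is illustrative only: the function `g`, the values of m and d’ and the sample points are hypothetical choices made here, d’ is taken positive, and reduced distances are used throughout, so that the image-space direction cosines follow from β’ = ∂V/∂y’, γ’ = ∂V/∂z’.

```python
import math

M_RED = -0.5     # reduced magnification m (illustrative value)
D_IMG = 20.0     # d' > 0: reduced distance from the posterior base-plane to the image plane

def g(y, z):
    """An arbitrary smooth function of the object coordinates (hypothetical choice)."""
    return 50.0 + 0.01 * (y * y + z * z)

def v0(yp, zp, y, z):
    """Ideal point characteristic of the generic form (7.4)."""
    return g(y, z) - math.sqrt(D_IMG ** 2 + (yp - M_RED * y) ** 2 + (zp - M_RED * z) ** 2)

def image_point(yp, zp, y, z, h=1e-6):
    """Follow the final ray from D'(y', z') in the base-plane to the plane x' = d'."""
    bp = (v0(yp + h, zp, y, z) - v0(yp - h, zp, y, z)) / (2 * h)   # beta'  = dV/dy'
    gp = (v0(yp, zp + h, y, z) - v0(yp, zp - h, y, z)) / (2 * h)   # gamma' = dV/dz'
    ap = math.sqrt(1 - bp * bp - gp * gp)                          # alpha' taken positive
    return yp + D_IMG * bp / ap, zp + D_IMG * gp / ap

y, z = 1.2, -0.7                                   # object point O(0, y, z)
base_plane_points = [(0.0, 0.0), (3.0, -1.0), (-2.5, 4.0)]
hits = [image_point(yp, zp, y, z) for yp, zp in base_plane_points]
print(hits, (M_RED * y, M_RED * z))
```

Whatever point D’ of the posterior base-plane the ray passes through, it arrives at the same point (my, mz) of the image plane, independently of the choice of g.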

We have introduced the aberration function in unusually general 
terms in order clearly to bring out the fact that the form of the aberration 
function f depends on the form of F₀ and the latter is not 
absolutely prescribed, i.e. to the extent that it depends on the pro- 
perties of K which one happens to aim at. Thus, in the case of the 
(axially) symmetric system, to be described in detail in Chapters 4 to 
7, one tends to think of F₀ as corresponding to the existence of a 
plane, undistorted image of a plane object, the image lying in the 
‘ideal image plane’. On the other hand, in other situations, for 
example imagery by systems of cylindrical refracting surfaces, 
such a ‘natural’ choice of image plane is not at hand. In any event, 
even in the symmetric case one might want the image of a plane 
object to lie on some curved surface; see, for example, Section 62. 
Further, if one considers imagery by light which is not monochro- 
matic, the system will not produce a sharp image lying in a particular 
image plane for all colours simultaneously, so that F₀ will correspond 
to a choice of position of receiving plane which is inevitably 
arbitrary to some extent (see Section 100). 


Problems 


P.2 (i). A system K consists of two homogeneous media in mutual 
contact, the plane x̄ = a being the boundary between them. Choosing 
the coordinate axes suitably, carry through as far as possible the 
determination of the point characteristic of K. (Note that this 
involves the solution of an algebraic equation of the fourth degree.) 
P.2 (ii). Why would you expect the result (4.5) to become meaningless 
in the limit r → ∞? 


P.2 (iii). Obtain the result (4.5). 


P.2 (iv). Show that the first mixed characteristic of the plane 
refracting surface x̄ = 0 is 

$$W_1 = x'[N'^2 - N^2(\beta^2 + \gamma^2)]^{1/2} + N(\beta y' + \gamma z').$$


P.2 (v). Find the angle characteristic T₀ which corresponds to the 
point characteristic V₀ given by equation (7.4). 
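
Before attempting P.2 (iv) analytically, it may help to see numerically that the stated W₁ is at least consistent with (5.1). The following sketch is a sanity check and not a solution: the indices, the sample point and the finite-difference step are arbitrary choices made here, unreduced distances are used as in Section 5, and the origin is taken to lie in the refracting plane.

```python
import math

N_OBJ, N_IMG = 1.0, 1.5    # N and N' (illustrative values)

def w1(xp, yp, zp, b, g):
    """First mixed characteristic of the plane refracting surface, as stated in P.2 (iv)."""
    return (xp * math.sqrt(N_IMG ** 2 - N_OBJ ** 2 * (b * b + g * g))
            + N_OBJ * (b * yp + g * zp))

def partial(f, args, i, h=1e-6):
    """Central-difference derivative of f with respect to argument number i."""
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)

point = (5.0, 1.0, -2.0, 0.3, -0.1)          # x', y', z', beta, gamma
xp, yp, zp, b, g = point
grad = [partial(w1, point, i) for i in range(5)]

ray_img = grad[:3]                            # N'alpha', N'beta', N'gamma' by (5.1)
snell_ok = (abs(ray_img[1] - N_OBJ * b) < 1e-6
            and abs(ray_img[2] - N_OBJ * g) < 1e-6)      # N'beta' = N beta, etc.
eikonal_residual = sum(c * c for c in ray_img) - N_IMG ** 2

# The final ray traced back to the plane x = 0 meets it at the point of incidence P
y_p = yp - xp * ray_img[1] / ray_img[0]
intercept_residual = grad[3] - N_OBJ * y_p    # (5.1): N(y - beta x/alpha) with x = 0 at P
print(snell_ok, eikonal_residual, intercept_residual)
```

The tangential components of the ray vector are continuous across the plane, the derivatives with respect to x’, y’, z’ satisfy the differential equation of Section 5, and the derivative with respect to β locates the point of incidence.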


CHAPTER 3 


SYMMETRIES AND REGULARITY 


8. Symmetries of optical systems and invariance properties 
of characteristic functions 


Many, perhaps most, of the optical systems encountered in 
practice exhibit symmetries of one kind or another. That a system 
K has a certain symmetry means that there exists an appropriate 
displacement of K, the end result of which is a system indistinguish- 
able from K. This general definition may be clarified by means of 
some specific examples. They will serve at the same time to eluci- 
date the somewhat formal sense which may have to be attached 
to the phrase ‘displacement of K’. 

Our first example is that of axial symmetry. In this context the 
displacement to be contemplated is a rigid rotation. However, for 
the sake of verbal uniformity, we imagine the elementary parts into 
which K may be thought of as subdivided all to be rotated through 
the same angle θ about some line 𝒜. If 𝒜 can be so chosen that the 
resulting system is indistinguishable from K, independently of the 
value of θ, K is axially symmetric, and 𝒜 is its axis (of symmetry). 
If K includes a diaphragm, or stop, which limits the passage of rays 
through K, it is preferable not to regard it as being a part of K in the 
present context. Without this convention the inclusion of a non-circular 
stop, for instance, in an otherwise axially symmetric system 
would suffice to destroy this symmetry; but it would do so in a 
somewhat trivial way. It is now evident that if the x̄-axis of a Cartesian 
system is the axis of K, then the refractive index depends on 
ȳ and z̄ only in the combination ȳ² + z̄². 

As a second example, suppose that K is plane-symmetric, i.e. 
has a plane of symmetry, which we may here take to be the plane 
z̄ = 0. The displacement appropriate to this situation is of a formal 
kind. One has to imagine all elementary parts of K to be so displaced 
that after the displacement the part of K lying to one side of the 
plane of symmetry is the mirror image of itself as it was before 
the displacement. (Of course, the plane of symmetry is the ‘mirror’ 
here.) Plane-symmetry therefore demands that N(x̄, ȳ, z̄) be independent 
of the sign of z̄. 

These two examples will suffice to illustrate the notion of sym- 
metry. It should be borne in mind that a given system may have 
several symmetries simultaneously; and various cases of interest 
will be examined in subsequent chapters. 

We come now to a crucial point in the argument. It consists of the 
observation that to every symmetry of a given system there corresponds 
a substitution of the ray-coordinates which leaves the point 
characteristic invariant. This means that on replacing y’, z’, y, z 
by certain functions of these variables, the resulting function is 
just V(y’, z’, y, z) again. To see this, let A(y, z), A’(y’, z’) be two 
points lying in the base-planes; and let them go over into the points 
A*(y*, z*) and A*’(y*’, z*’) as a result of the displacement appropriate 
to the given symmetry. The base-planes are, in the present 
context, to be regarded as ‘parts of the system’ so that they take 
part in the displacement. Evidently A* is intended to refer to the 
object space, as the notation indicates, so that it may correspond to 
one or other of the undisplaced points A, A’, depending on the 
particular symmetry being contemplated (cf. Section 53). At 
any rate, the point characteristic is the same function for the original 
as for the displaced system, these being indistinguishable from each 
other. Since V(A, A’) is obviously equal to V(A*, A*’), the relation 

$$V(y^{*\prime}, z^{*\prime}, y^*, z^*) = V(y', z', y, z) \tag{8.1}$$


must therefore hold identically, the coordinates on the left being 
known functions of those which appear on the right. (8.1) is just 
the formal expression of the invariance of V which was to be 
demonstrated. If one considers other characteristic functions the 
situation is not essentially different, though in the context of the 
mixed characteristics one may have certain complications, arising 
from the different geometrical character of the coordinates in the 
object space on the one hand, and those in the image space on the 
other. At any rate, we shall deal with problems relating to special 
cases as we come to them. 

As a particularly simple example of (8.1) we return to the case 
of plane-symmetry. Then A(y, z) evidently goes into A*(y, −z), 
and A’(y’, z’) goes into A*’(y’, −z’). (8.1) therefore reads 

$$V(y', -z', y, -z) = V(y', z', y, z). \tag{8.2}$$

In other words, if K is symmetric about the plane z̄ (= z̄’) = 0 
its point characteristic must be invariant under the simultaneous 
reversal of sign of z and z’. 

An interesting conclusion may be drawn when one has a continuous 
symmetry group, i.e. when the displacements corresponding 
to the symmetry in question depend continuously upon a 
parameter ω. In (8.1) the starred variables are then continuous 
functions of ω, and we can always arrange the value ω = 0 to correspond 
to the identity between unstarred and starred variables. For 
infinitesimal ω there exist functions s, t, together with their primed 
counterparts, such that 

$$y^* = y + \omega s, \qquad z^* = z + \omega t, \qquad y^{*\prime} = y' + \omega s', \qquad z^{*\prime} = z' + \omega t'. \tag{8.3}$$

Inserting these in (8.1) and expanding the left-hand member in 
powers of ω, one obtains, upon retaining only terms linear in ω, 
the identity 

$$\omega\left(s'\frac{\partial V}{\partial y'} + t'\frac{\partial V}{\partial z'} + s\frac{\partial V}{\partial y} + t\frac{\partial V}{\partial z}\right) = 0. \tag{8.4}$$

Recalling that ω can be chosen at will it follows from this, together 
with (6.1), that 

$$\beta' s' + \gamma' t' = \beta s + \gamma t. \tag{8.5}$$

In the case of axial symmetry, for example, one has ω = θ, s = z, 
t = −y, s’ = z’, t’ = −y’ (cf. equation (12.1)), so that (8.5) then 
becomes 

$$\beta' z' - \gamma' y' = \beta z - \gamma y. \tag{8.6}$$

The quantity j = βz − γy has therefore the same value in the object 
space as its primed counterpart j’ has in the image space. Any 
quantity which has this property is called an optical invariant. 
In particular, the existence of the optical invariant j is evidently a 
consequence of axial symmetry alone. 
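
The equality of βz − γy and its primed counterpart can be observed numerically for any function V which is invariant under simultaneous rotation of (y, z) and (y’, z’). The sketch below uses a made-up V built from the rotational invariants y’² + z’², yy’ + zz’ and y² + z²; it is not claimed to be the point characteristic of any actual system, and the constants in it are arbitrary. The direction cosines are obtained from the reduced equations β’ = ∂V/∂y’, γ’ = ∂V/∂z’, β = −∂V/∂y, γ = −∂V/∂z.

```python
import math

D = 30.0   # hypothetical constant (reduced separation of the base-planes)

def V(yp, zp, y, z):
    """A made-up axially symmetric function: it depends only on rotational invariants."""
    xi, eta, zeta = yp * yp + zp * zp, y * yp + z * zp, y * y + z * z
    return math.sqrt(D * D + xi - 2 * eta + zeta) + 0.002 * eta * zeta - 0.001 * xi * xi

def directions(yp, zp, y, z, h=1e-6):
    """Return beta', gamma', beta, gamma from central differences of V."""
    args = [yp, zp, y, z]
    grads = []
    for i in range(4):
        up, dn = list(args), list(args)
        up[i] += h
        dn[i] -= h
        grads.append((V(*up) - V(*dn)) / (2 * h))
    return grads[0], grads[1], -grads[2], -grads[3]

def invariants(yp, zp, y, z):
    """Return the pair (beta z - gamma y, beta' z' - gamma' y')."""
    bp, gp, b, g = directions(yp, zp, y, z)
    return b * z - g * y, bp * zp - gp * yp

j_obj, j_img = invariants(1.0, 2.0, -0.5, 0.8)
print(j_obj, j_img)
```

The two values coincide to within the accuracy of the differencing, although neither vanishes; adding to V a term which is not rotationally invariant would in general destroy the agreement.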

If a system has several symmetries there will be several simul- 
taneous identities of the form (8.1), and we shall have occasion to 
deal with some of the more interesting possibilities later on. The 
central point is that any such identity represents a generic restriction 
on the form of the characteristic function, which immediately brings 
with it a corresponding characterization of the imagery produced by 
the system. Herein lies the power of Hamilton’s method, especially 
when a certain generic assumption of regularity is satisfied. It is 
this, therefore, to which we now turn our attention. 


9. Regularity 
Whichever of the alternative characteristic functions of a given 
system be chosen, we may be sure that in almost all cases of practical 
interest it will be an exceedingly complicated function of the vari- 
ables on which it depends. In these circumstances one is forced to 
devise approximations of some kind. Expansions in power series 
immediately spring to mind; and in this context the idea of regu- 
larity arises. 

For convenience let us refer to the pair of sets of coordinate 
axes x̄, ȳ, z̄ and x̄’, ȳ’, z̄’ as a coordinate basis. Then we have the 
following definition: 


A system is called regular with respect to a chosen coordinate 
basis if F can be written as a power series in the ray-coordinates 
qᵢ. 


It should be carefully noted that regularity is a joint property of 
the system and the coordinate basis. By way of illustration, let K 
consist of a single refracting surface of revolution. Specifically, 
take it to be the paraboloid whose equation, referred to auxiliary 
axes &,9,%, is kk? = (2+ 8%)? (k = constant). If Z is some ray 
through K, consider now alternative coordinate bases such that 
(i) the x̄-axis lies along the initial ray and the x̄’-axis along the final 
ray; or (ii) these axes lie along the axis of the paraboloid. Then 
regularity obtains in the first case, but not in the second; yet in both 
cases one has the same optical system as such. Still, in cases where 
a definite choice of coordinate basis is understood, one may some- 
times speak simply of the ‘regularity of the system’. 

Even when it is known that the condition of regularity is satis- 
fied in cases of practical interest, the situation is bedevilled by the 
almost total absence of knowledge concerning the radii of conver- 
gence of the series which are encountered. This is an unfortunate 
state of affairs since the approximation to the actual characteristic 
function represented by a truncated power series is meaningless if
the series diverges, and useless if it converges too slowly. If
$F(q_1, \ldots, q_4)$ converges sufficiently rapidly only within a range of
values of the $q_i$ which is inadequate, one can help oneself by
representing $F$ as a power series in $q_i - \bar{q}_i$, where the $\bar{q}_i$ are appropriately
chosen constants. In our language this amounts to going over to 
a new coordinate basis. 
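
The effect of shifting the point of expansion can be seen in a toy example. The sketch below does not use an optical characteristic function; it takes ln(1 + q), whose series about q = 0 has radius of convergence 1, and compares a truncated series about the origin with one about a shifted centre. The function, the centre and the number of terms are assumptions chosen purely for illustration.

```python
import math

def log1p_series(q, centre=0.0, terms=20):
    """Truncated Taylor series of ln(1 + q) in powers of (q - centre)."""
    u = (q - centre) / (1.0 + centre)   # series variable; converges for |u| < 1
    total = math.log(1.0 + centre)
    for k in range(1, terms + 1):
        total += (-1) ** (k + 1) * u ** k / k
    return total

q = 1.5
exact = math.log(1.0 + q)
# About the origin q lies outside the circle of convergence.
err_origin = abs(log1p_series(q, centre=0.0) - exact)
# About the shifted centre the same point lies well inside it.
err_shifted = abs(log1p_series(q, centre=1.0) - exact)
```

With the same number of terms the shifted series is accurate where the series about the origin is meaningless, which is the situation described above.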


10. Parabasal optics in general 


Accounts of the theory of axially symmetric systems often begin 
with a discussion of paraxial optics, i.e. of the imagery by rays 
which lie in a sufficiently small neighbourhood of the axis $\mathcal{A}$. That
such rays do in fact exist is an assumption; it is in effect an
assumption of the regularity of the system with respect to a coordinate
basis such that the $x$-axis and the $x'$-axis both lie along $\mathcal{A}$. The
generalization appropriate to a system K without symmetries is to
consider some ray $\mathcal{R}_0$ through K as given, and to investigate imagery
by rays which lie in a sufficiently small neighbourhood of the
base-ray $\mathcal{R}_0$. In this more general context we shall speak of parabasal
optics. There is, of course, the implicit assumption that K is regular 
with respect to a coordinate basis such that the x-axis lies along the 
initial ray and the x’-axis along the final ray. 

To go into a little more detail, let us choose the point
characteristic. Note that because of our various conventions the base-planes
are normal to $\mathcal{R}_0$. Regularity implies that


$$V = a_0 + a_1 y' + a_2 z' + a_3 y + a_4 z + O(2), \tag{10.1}$$


where the $a_i$ are constants. Throughout, the symbol $O(n)$ will
denote terms which are of degree not less than $n$ in the
ray-coordinates. The limit in which $y', z', y, z$ go to zero represents
coincidence with $\mathcal{R}_0$. Using (6.1) it follows that $a_1, \ldots, a_4$ must be
zero. Including now terms of the second degree explicitly, we may
write


$$\begin{aligned} V = a_0 &+ \tfrac12 b_1 y'^2 + b_2 y'z' + b_3 y'y + b_4 y'z + \tfrac12 b_5 z'^2 \\ &+ b_6 z'y + b_7 z'z + \tfrac12 b_8 y^2 + b_9 yz + \tfrac12 b_{10} z^2 + O(3), \end{aligned} \tag{10.2}$$


where the numerical factors have been included merely for
convenience. As far as parabasal optics is concerned the term $O(3)$ is
irrelevant, and may be omitted. Then, from (6.1),

$$\begin{aligned} \beta' &= b_1 y' + b_2 z' + b_3 y + b_4 z, \\ \gamma' &= b_2 y' + b_5 z' + b_6 y + b_7 z, \\ -\beta &= b_3 y' + b_6 z' + b_8 y + b_9 z, \\ -\gamma &= b_4 y' + b_7 z' + b_9 y + b_{10} z. \end{aligned} \tag{10.3}$$


Parabasal optics is thus characterized by the linearity of the
relations between the ray-coordinates and their conjugates, there being
at most ten constants $b_i$ which enter into these relations. These
constants, called parabasal coefficients, are determined by the
constitution of the system in hand (for a given choice of $\mathcal{R}_0$). In the
absence of symmetries no restriction in principle is laid upon the 
values which the parabasal coefficients might take, except in as 
far as the very use of V implies that the inequality 


$$b_3 b_7 - b_4 b_6 \neq 0 \tag{10.4}$$


must hold; for this ensures that to any given initial ray there corre- 
sponds just one final ray. 

In practice one often traces parabasal rays through a given system 
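by some step-by-step procedure, starting with chosen values of the
‘initial variables’ $y, z, \beta, \gamma$, say. Granted the linearity of parabasal
optics, one thus determines the coefficients of the linear transformation

$$\begin{aligned} y' &= A_1 y + B_1\beta + E_1 z + F_1\gamma, \\ z' &= E_2 y + F_2\beta + A_2 z + B_2\gamma, \\ \beta' &= C_1 y + D_1\beta + G_1 z + H_1\gamma, \\ \gamma' &= G_2 y + H_2\beta + C_2 z + D_2\gamma. \end{aligned} \tag{10.5}$$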


One will therefore be confronted with sixteen constants, and mere
inspection will not in general reveal to what extent they are
independent of each other. However, if we now draw upon the
existence of the point characteristic we know at once that these sixteen
constants can all be expressed in terms of the ten constants $b_i$, so
that in the most general case there must exist six identities between 
the constants defined by (10.5). Probably the easiest way to obtain 
these identities is to use (10.5) in the expression 


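$$dV = \beta'\,dy' + \gamma'\,dz' - \beta\,dy - \gamma\,dz, \tag{10.6}$$

so regarding $y, z, \beta, \gamma$ now as independent variables. The six
conditions of integrability on the resulting linear differential form
give just the six required identities. One thus finds

$$\begin{aligned} A_1 D_1 - B_1 C_1 + E_2 H_2 - F_2 G_2 &= 1, \\ A_2 D_2 - B_2 C_2 + E_1 H_1 - F_1 G_1 &= 1, \end{aligned} \tag{10.7}$$

together with the four homogeneous relations

$$\begin{aligned} A_1 G_1 - C_1 E_1 &= A_2 G_2 - C_2 E_2, \\ A_1 H_1 - C_1 F_1 &= B_2 G_2 - D_2 E_2, \\ B_1 G_1 - D_1 E_1 &= A_2 H_2 - C_2 F_2, \\ B_1 H_1 - D_1 F_1 &= B_2 H_2 - D_2 F_2. \end{aligned} \tag{10.8}$$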
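
The six identities can be checked numerically on the result of a step-by-step trace. The sketch below builds a 4×4 matrix as a product of elementary steps and evaluates the residuals of the two inhomogeneous and the four homogeneous relations, written out from the integrability conditions on (10.6). The elementary steps (a free transfer and a thin refraction described by a symmetric 2×2 matrix) are assumptions made for the illustration, not a model taken from the text; rows are ordered y', z', β', γ' and columns y, z, β, γ.

```python
from fractions import Fraction as Fr

def matmul(P, Q):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transfer(d):
    """Assumed step: free propagation through a reduced distance d."""
    return [[1, 0, d, 0], [0, 1, 0, d], [0, 0, 1, 0], [0, 0, 0, 1]]

def refract(a, b, c):
    """Assumed step: (beta, gamma) changes by -S(y, z), S = [[a, c], [c, b]]."""
    return [[1, 0, 0, 0], [0, 1, 0, 0], [-a, -c, 1, 0], [-c, -b, 0, 1]]

def identity_residuals(M):
    """Residuals of the six identities for the sixteen constants of (10.5)."""
    (A1, E1, B1, F1), (E2, A2, F2, B2), (C1, G1, D1, H1), (G2, C2, H2, D2) = M
    return [
        A1 * D1 - B1 * C1 + E2 * H2 - F2 * G2 - 1,
        A2 * D2 - B2 * C2 + E1 * H1 - F1 * G1 - 1,
        (A1 * G1 - C1 * E1) - (A2 * G2 - C2 * E2),
        (A1 * H1 - C1 * F1) - (B2 * G2 - D2 * E2),
        (B1 * G1 - D1 * E1) - (A2 * H2 - C2 * F2),
        (B1 * H1 - D1 * F1) - (B2 * H2 - D2 * F2),
    ]

# A hypothetical three-step trace: transfer, refraction, transfer.
M = matmul(transfer(Fr(3, 2)),
           matmul(refract(Fr(1, 5), Fr(-2, 7), Fr(1, 3)), transfer(Fr(2))))
residuals = identity_residuals(M)
```

All six residuals vanish for such a product, whereas a matrix whose refraction block is not symmetric (and which therefore cannot be derived from a characteristic function) fails at least one of them.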


The existence of these identities is thus a simple illustration of the 
very direct way in which interesting consequences flow from the 
general theory developed above. Moreover, no assumptions of 
symmetry were hitherto made in this section. To illustrate the effects 
of symmetries of K, let us suppose K to be plane-symmetric; see 
Section 8. (Other cases will be considered later.) The base-ray is of 
course taken to lie in the plane of symmetry of K. Then, recalling 
(8.2), inspection of (10.2) shows that in this case one must have 


$$b_2 = b_4 = b_6 = b_9 = 0. \tag{10.9}$$


It will be noticed that if at the same time K has a second plane of 
symmetry normal to the first, no further restrictions on the para- 
basal terms of V are implied. At any rate, the plane-symmetric 
system has at most six parabasal coefficients. As a result of (10.9) 


equations (10.3) simplify greatly, for now

$$\begin{aligned} \beta' &= b_1 y' + b_3 y, & \gamma' &= b_5 z' + b_7 z, \\ -\beta &= b_3 y' + b_8 y, & -\gamma &= b_7 z' + b_{10} z. \end{aligned} \tag{10.10}$$


If these be substituted in (10.5) one finds at once that only the eight
constants $A_i, B_i, C_i, D_i$ ($i = 1, 2$) can differ from zero; and these
must satisfy the two surviving identities (10.7), i.e.

$$A_i D_i - B_i C_i = 1 \quad (i = 1, 2). \tag{10.11}$$

If one defines two quantities,

$$\Lambda_1 = \hat{y}\beta - \hat{\beta}y, \quad \Lambda_2 = \hat{z}\gamma - \hat{\gamma}z, \tag{10.12}$$

in terms of two arbitrary parabasal rays, of which the second is
distinguished by a circumflex, then it is an almost trivial exercise
to show that

$$\Lambda_1' = \Lambda_1, \quad \Lambda_2' = \Lambda_2. \tag{10.13}$$

In other words $\Lambda_1$ and $\Lambda_2$ are optical invariants, sometimes called
Lagrange invariants. Unlike the invariant 7 of Section 8 they relate 
only to parabasal rays; and are, moreover, each defined in terms of a 
pair of such rays. 
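
The exercise can be sketched as follows for the first invariant. It is assumed here that (10.12) reads $\Lambda_1 = \hat{y}\beta - \hat{\beta}y$, and that the plane-symmetric form of (10.5) holds, i.e. $y' = A_1 y + B_1\beta$ and $\beta' = C_1 y + D_1\beta$, with the same constants for both rays.

```latex
\begin{aligned}
\Lambda_1' &= \hat{y}'\beta' - \hat{\beta}'y' \\
  &= (A_1\hat{y} + B_1\hat{\beta})(C_1 y + D_1\beta)
   - (C_1\hat{y} + D_1\hat{\beta})(A_1 y + B_1\beta) \\
  &= (A_1 D_1 - B_1 C_1)(\hat{y}\beta - \hat{\beta}y) \\
  &= \Lambda_1 ,
\end{aligned}
```

The last step uses (10.11) with $i = 1$. The argument for the second invariant is identical, with $z, \gamma$ and the constants carrying the index 2 in place of $y, \beta$ and those carrying the index 1.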

Returning to the general case, let $\mathcal{I}$ denote some object plane and
$\mathcal{I}'$ some receiving plane, both of these being supposed normal to the
base-ray. The perpendicular distance of $\mathcal{I}'$ from the posterior
base-plane $\mathcal{B}'$ is $d'$, so that the equation of $\mathcal{I}'$ is simply $x' = d'$;
in short, the symbolism is that used in the discussion leading to
equation (7.4). A parabasal ray through $O(y, z)$ intersects $\mathcal{I}'$ in
the point $O'$ whose coordinates are

$$Y' = y' + d'\beta', \quad Z' = z' + d'\gamma', \tag{10.14}$$

bearing in mind that $\alpha = \alpha' = 1$ in the parabasal limit. It now proves
convenient to make use of the freedom one still has to rotate the
axes in the image space about $\mathcal{R}_0$. One can choose the angle of
rotation so that the resulting value of $b_2$ is zero. By an analogous
rotation in the object space one can also always reduce $b_9$ to zero. We
henceforth imagine this to have been done. Then, inserting (10.3)
into (10.14), with $b_2 = b_9 = 0$, we find that

$$\begin{aligned} Y' &= (1 + d'b_1)y' + d'(b_3 y + b_4 z), \\ Z' &= (1 + d'b_5)z' + d'(b_6 y + b_7 z). \end{aligned} \tag{10.15}$$

Setting exceptional values of the parabasal coefficients aside, we
see immediately that all rays through $O$ pass through a straight line
in the plane $d' = -1/b_1$, and they also pass through a straight line in
the plane $d' = -1/b_5$. One thus has in general two focal lines, the
distance $|b_1^{-1} - b_5^{-1}|$ between which is known as the astigmatic focal
distance. The focal lines are mutually perpendicular, but, contrary
to what is often stated in the literature, they need not be
perpendicular to $\mathcal{R}_0$. As a matter of fact, the angles between the focal lines
and $\mathcal{R}_0$ cannot be calculated on the basis of the quadratic terms of V
alone.

If a sharp image is to be obtained, the condition

$$b_1 = b_5 \tag{10.16}$$

must be satisfied. Evidently a square grid is then transformed by
K into a grid of similar parallelograms. These in general become
rectangles when K is plane-symmetric, since then

$$Y' = d'b_3 y, \quad Z' = d'b_7 z. \tag{10.17}$$

One thus has a magnification $m_1 = d'b_3$ in the direction defined by
$Y'$ alone increasing and another magnification $m_2 = d'b_7$ in a
direction at right angles to this. The image will therefore be geometrically
similar to the object, i.e. it will be undistorted, provided

$$b_3 = b_7. \tag{10.18}$$

It may be noted that the quadratic terms of the ideal characteristic
function (7.4) indeed satisfy (10.16) and (10.18).
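
The two focal lines can be exhibited numerically. The sketch below evaluates the intersection point of (10.15), taken with the axes already rotated so that the coefficients of y'z' and yz in (10.2) vanish, and locates the planes in which Y' or Z' ceases to depend on the ray chosen through the object point. The function names and the coefficient values are hypothetical.

```python
from fractions import Fraction as Fr

def receiving_point(coeffs, d, yp, zp, y, z):
    """Point (Y', Z') in the plane x' = d, following (10.15)."""
    b1, b3, b4, b5, b6, b7 = coeffs
    Y = (1 + d * b1) * yp + d * (b3 * y + b4 * z)
    Z = (1 + d * b5) * zp + d * (b6 * y + b7 * z)
    return Y, Z

def focal_planes(coeffs):
    """Positions d' of the two focal lines and the astigmatic focal distance."""
    b1, _, _, b5, _, _ = coeffs
    d1, d2 = -1 / b1, -1 / b5
    return d1, d2, abs(d1 - d2)

# Illustrative values of (b1, b3, b4, b5, b6, b7) only.
coeffs = (Fr(-1, 4), Fr(1, 2), Fr(1, 10), Fr(-1, 6), Fr(1, 5), Fr(2, 3))
d1, d2, astigmatic_distance = focal_planes(coeffs)

# Two different rays through the same object point (y, z) = (1, 2).
first = receiving_point(coeffs, d1, Fr(0), Fr(0), Fr(1), Fr(2))
second = receiving_point(coeffs, d1, Fr(5), Fr(-3), Fr(1), Fr(2))
```

In the plane d' = d1 the two rays share the same Y' but differ in Z', so that the rays through the object point fill a line parallel to the Z'-direction; in the plane d' = d2 the roles of Y' and Z' are interchanged.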

The following remarks may be appropriate at this point. If K 
is not regular with respect to the coordinate basis none of the pre- 
ceding equations of this section obtain. This is so, for example, in 
the case of the paraboloid of Section 9 with the second of the co- 
ordinate bases there defined. The reader whose mind is directed 
towards practical realities might be inclined to argue that regularity 
could be restored by replacing the actual refracting surface by a 
small, spherical patch at and near the cusp; and that one can then 
take a base-ray $\mathcal{R}_0$ along the axis. This much is certainly true. On the
other hand this procedure is quite useless for another reason.
Parabasal optics is intended to be an approximation, albeit a crude one,
in an extended neighbourhood of $\mathcal{R}_0$, that is to say, a
neighbourhood not so small as to be of merely academic interest. However, in
the present situation the geometry of the surface of interest does not 
enter into the parabasal optics at all, so that the latter is irrelevant. 
In short, one has gained nothing. 

It should be clear by now how any question concerning parabasal 
imagery can be answered by elementary arguments based on
equations (10.3) and (10.14), if, in the latter, $d'$ be regarded as a
parameter whose value can be chosen at will. The subject will therefore
not be pursued further, apart from the remark that the preceding 
work could of course have been based equally well on the use of one 
of the other characteristic functions. Using the notation introduced 
at the end of Section 6, the characteristic function, up to terms of
the second degree, will be

$$F = a_0 + \tfrac12 \sum_{i,j=1}^{4} b_{ij}\, q_i q_j, \tag{10.19}$$

whence

$$p_j = \sum_{i=1}^{4} b_{ij}\, q_i, \tag{10.20}$$

and these equations correspond to (10.2) and (10.3) respectively.

It must be carefully observed that, the notation notwithstanding,
the meaning of the constants $b_{ij}$ depends of course on the
particular choice of F. We shall frequently make use of the con- 
vention that certain symbols denote quantities of a certain generic 
type, so that the precise significance to be attached to such symbols 
will depend upon the context in which they occur. Whatever slight 
difficulties may be encountered as a result of adhering to this con- 
vention, it is more than compensated for by the way in which it 
enables us to avoid an endless and confusing proliferation of 
symbols. 


11. Aberration coefficients


Given a system which is regular with respect to a suitable coordinate 
basis, its aberration function f may be written as a power series: 


$$f = \sum_{n=1}^{\infty} f_n. \tag{11.1}$$

Here $f_n$ is a homogeneous polynomial of degree $n+1$ in the
ray-coordinates, and it is called the aberration function of order n. Its
generic form is

$$f_n = \sum_{\lambda=0}^{n+1} \sum_{\mu=0}^{\lambda} \sum_{\nu=0}^{\mu} f_{n\lambda\mu\nu}\, q_1^{\,n+1-\lambda} q_2^{\,\lambda-\mu} q_3^{\,\mu-\nu} q_4^{\,\nu}. \tag{11.2}$$

Then the $f_{n\lambda\mu\nu}$, which are constants of the system for a given
coordinate basis, are called the aberration coefficients of order n.
Occasionally we shall refer to them more precisely as characteristic
aberration coefficients, to distinguish them from the effective aberration
coefficients to be considered later on. Since a homogeneous polynomial
of degree $m$ in $s$ variables has $\binom{m+s-1}{m}$ coefficients, the number of
aberration coefficients of order $n$ is not greater than

$$\bar{n}_n = \tfrac16 (n+2)(n+3)(n+4). \tag{11.3}$$

This number relates to arbitrary positions of the object surface.
If the latter is taken as fixed (as is frequently done in practice) all
terms of (11.1) which depend on the object space coordinates $q_3$
and $q_4$ alone need not be counted. There are $n+2$ of these, and so the
number $\bar{n}_n$ is reduced to

$$n_n = \tfrac16 (n+1)(n+2)(n+6). \tag{11.4}$$

In this sense there are at most 7, 16, 30, 50, 77, ... aberration
coefficients of orders 1, 2, 3, 4, 5, .... The problem in practice is to


calculate these numbers, given the physical constitution of the 
system; and, as explained at the end of Section 9, one may have to 
do this for several choices of coordinate basis. The computational 
problem is a difficult one, and in practice the sequence (11.1) has 
to be truncated after a very few terms. Fortunately, the number of 
independent aberration coefficients of the various orders is reduced 
by every symmetry the system happens to possess; granted, of 
course, that the base-ray is chosen appropriately. For example, if 
K is doubly plane-symmetric (see Section 88), i.e. has just two 
planes of symmetry, with $\mathcal{R}_0$ taken along their common line, then
one has in place of (11.3) the number

$$\bar{n}_n = \begin{cases} \tfrac{1}{12}(n+3)(n^2+6n+11) & (n \text{ odd}), \\ 0 & (n \text{ even}). \end{cases} \tag{11.5}$$

This gives $\bar{n}_n = 0, 19$ for the coefficients of orders 2 and 3, whereas
$\bar{n}_n = 20, 35$ according to (11.3). Even this moderate degree of
symmetry therefore reduces the computational task very considerably.
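
The counts quoted in this section can be confirmed by direct enumeration. The sketch below lists the monomials of degree n + 1 in the four ray-coordinates y', z', y, z and counts (a) all of them, (b) those which do not depend on the object-space coordinates alone, and (c) those left unchanged by both reflections of a doubly plane-symmetric system. It is assumed for (c) that the two planes of symmetry are the coordinate planes through the base-ray, so that the reflections are (z', z) → (−z', −z) and (y', y) → (−y', −y); the function names are hypothetical.

```python
from itertools import product

def monomials(n):
    """Exponent tuples (a, b, c, d) of y'^a z'^b y^c z^d with a+b+c+d = n+1."""
    m = n + 1
    return [e for e in product(range(m + 1), repeat=4) if sum(e) == m]

def count_general(n):
    """Number of coefficients of order n without any restriction, cf. (11.3)."""
    return len(monomials(n))

def count_fixed_object(n):
    """Drop the terms involving the object-space coordinates alone, cf. (11.4)."""
    return sum(1 for a, b, c, d in monomials(n) if a + b > 0)

def count_doubly_plane_symmetric(n):
    """Keep the terms even in (y', y) and even in (z', z), cf. (11.5)."""
    return sum(1 for a, b, c, d in monomials(n)
               if (a + c) % 2 == 0 and (b + d) % 2 == 0)

fixed_object_counts = [count_fixed_object(n) for n in range(1, 6)]
symmetric_counts = [count_doubly_plane_symmetric(n) for n in (2, 3)]
general_counts = [count_general(n) for n in (2, 3)]
```

The enumeration reproduces the sequence 7, 16, 30, 50, 77 and the pairs 0, 19 and 20, 35 quoted in the text.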

Every term of (11.2) is said to represent an aberration, so that one 
can speak of the ‘number of aberrations’ rather than of the number 
of aberration coefficients. Now consider the displacement $\boldsymbol{\epsilon}'$
introduced in Section 7. (It suffices to restrict ourselves to the state of
affairs contemplated there, since the generalization to more
complicated situations, e.g. to curved image surfaces, is easily achieved.)
Each of the components of $\boldsymbol{\epsilon}'$ appears as a sequence of homogeneous
polynomials of degree 1, 2, ... in the ray-coordinates:

$$\epsilon'_y = \sum_{n=1}^{\infty} \epsilon'_{yn}, \quad \epsilon'_z = \sum_{n=1}^{\infty} \epsilon'_{zn}. \tag{11.6}$$

$\boldsymbol{\epsilon}'_n$ is induced by the aberrations of order $m \le n$. This means that,
strictly speaking, one has to distinguish between the ‘aberrations
of order s’ and the ‘displacement of order s’, for these terms refer
to (11.2) and (11.6) respectively. However, in practice one fre- 
quently speaks indiscriminately of the aberrations of order s in 
either case. 

Taken by itself, a particular $n$th-order term of (11.2), $\psi_n$ say,
gives rise to a certain $n$th-order displacement $\boldsymbol{\epsilon}'_n(\psi_n)$. The latter
depends in a specific way upon the ray-coordinates, and its
magnitude depends linearly upon the aberration coefficient which governs
$\psi_n$, that is to say, which multiplies it. It is usual to examine the curve
generated by the points of intersection with the receiving plane of 
a suitably selected one-parameter set of rays from a fixed point O 
of the object. The shape and location of this curve, and the way in 
which it depends upon parameters (such as the position of O) 
which are still at our disposal, completely characterize the aberra- 
tion in hand. The purpose of this procedure is to gain insight into the 
geometrical significance of the various terms of (11.2). More 
detailed investigation reveals that any particular aberration of
order $n$ in a certain sense has as its natural counterpart an
aberration in every order exceeding $n$; and that generally speaking various
aberrations of a given order can be put into groups, which again 
have their higher-order counterparts; see, in particular, Sections 
21, 22 and 77. 

The preceding remarks are intended to indicate in general terms 
that aberrations, or aberration coefficients, can be usefully divided 
into various classes or types. The precise details of such a classifica- 
tion naturally depend upon various conventions introduced from 
time to time. However, rather than embark on such a programme 
at this stage in the most general terms, it seems desirable to proceed 
directly to the consideration of systems having given symmetries. 
The investigation of the aberrations and their significance in these 
more specialized situations is easier and therefore more readily 
appreciated; the more so as we shall not hesitate occasionally to 
repeat in the more specialized circumstances work we have already 
done in quite general terms. This is particularly true of the lengthy 
treatment of the symmetric system which is intended to serve 
as a kind of prototype. Having gone through much the same routine 
a few times, the reader should experience no overwhelming diffi- 
culties with whatever case comes to hand, no matter how general. 


Problems


P.3(i). Obtain the angle characteristic T for the paraboloid de- 
scribed in Section 9, the second of the alternative coordinate bases 
being chosen. Show especially that T is irregular. (See equations 
(117.2-15).) 

P.3 (ii). Give an example of a wavefront W such that all its rays
pass exactly (i.e. without restriction to parabasal optics) through
two (non-intersecting) curves. Hence show that for an arbitrary
$\mathcal{R}_0$ the parabasal focal lines are not necessarily normal to $\mathcal{R}_0$.


P.3 (iii). Show that $\Lambda_1 + \Lambda_2$ is an optical invariant even when K
has no symmetries.


P.3 (iv). A circle in the object plane and with centre on $\mathcal{R}_0$ has as
its parabasal image a certain curve $\mathcal{C}$. Determine $\mathcal{C}$ and examine the
condition that it must not shrink to a point.


P.3 (v). In the parabasal region, for fixed $x, y, z$, V will be a function
of $x', y', z'$ which may be written in the form (10.2), with the term
$O(3)$ omitted. Obtain differential equations which the functions
$a_0(x'), b_1(x'), \ldots, b_{10}(x')$ must satisfy. ($b_2$ and $b_9$ may be taken to be
zero, but their derivatives cannot, of course, be supposed to vanish
at the same time.) Solve some of the equations.


P.3(vi). An optical system consists of a homogeneous glass 
cylinder of arbitrary cross-section. What can be said about the 
generic form of V? 


P.3 (vii). Does the system of the preceding problem in general 
have non-vanishing second-order aberrations? If so, how many co- 
efficients govern them? 


CHAPTER 4 


THE SYMMETRIC SYSTEM (PART I) 


12. Definition of the symmetric system 


A system is called symmetric (without qualification) if it has (i) an 
axis of symmetry which has points in common with both the 
object space and the image space (cf. Section 93), and (ii) a plane 
of symmetry which contains the axis. Of course, when both con- 
ditions are satisfied every plane containing the axis is a plane of 
symmetry. On the other hand, mere axial symmetry does not imply 
plane-symmetry. In this context it may be helpful to think about 
a turbine with non-radial blades; see also Chapter 7. 

The displacements corresponding to the present symmetries 
are just those considered explicitly in Section 8. With regard to 
axial symmetry, a rotation of K through the angle $\theta$ takes $A$, $A'$
into $A^*$, $A^{*\prime}$, where now

$$\begin{aligned} y^{*\prime} &= y'\cos\theta - z'\sin\theta, & z^{*\prime} &= y'\sin\theta + z'\cos\theta, \\ y^* &= y\cos\theta - z\sin\theta, & z^* &= y\sin\theta + z\cos\theta. \end{aligned} \tag{12.1}$$

A coordinate basis has been adopted such that the $x$-axis and the
$x'$-axis both lie along the axis $\mathcal{A}$ of K, whilst the $y$-axis and $y'$-axis
are mutually parallel. The plane containing these axes will be called
the meridional plane (or also tangential plane), whilst the sagittal
plane is normal to this and contains $\mathcal{A}$. In the context of the
symmetric system the particular coordinate basis just chosen shall be
understood throughout, so that later on reference to the ‘regularity
of K’ will be intended to mean regularity with respect to this
coordinate basis.

As for the second symmetry condition, we may, without loss of
generality, choose the meridional plane to be the plane of symmetry,
so that in this case

$$y^{*\prime} = y', \quad z^{*\prime} = -z', \quad y^* = y, \quad z^* = -z. \tag{12.2}$$

The identity (8.1), i.e.

$$V(y^{*\prime}, z^{*\prime}, y^*, z^*) = V(y', z', y, z), \tag{12.3}$$

must now hold for all values of $y', z', y, z$ and $\theta$, for both (12.1) and
(12.2) taken separately. Precisely analogous equations to these will
obtain in the context of the other characteristics $T$, $W_1$ and $W_2$,
since under rotations and reflections direction cosines transform 
exactly like the corresponding Cartesian coordinates; whilst the ray- 
coordinates in the object space on the one hand, and those in the 
image space on the other, merely transform amongst themselves, 
i.e. do not get mixed up with each other. 

In the light of types of symmetries to be considered later, the 
traditional terminology, to which we have adhered here, is somewhat 
unfortunate. It is not very sensible to single out a class of systems 
defined by a particular set of symmetries and merely to call the 
members of this class ‘symmetric’. In short, to avoid needless 
ambiguity we can be specific by speaking, in the present context, 
of ‘r-symmetry’ rather than of ‘symmetry’; but we shall do so 
only on occasions where the usual terminology seems altogether 
too inadequate.


13. Form of the characteristic function. Rotational invariants 


Equations (12.1) and (12.3) together express the condition that V 
must be invariant under rotations. To investigate the explicit 
consequences of this condition, as regards the generic form of V, 
we may proceed in the following elementary way. First, introduce
four new independent variables $\xi, \eta, \zeta, \sigma$ in place of $y', z', y, z$, of
which the first three are

$$\xi = y'^2 + z'^2, \quad \eta = yy' + zz', \quad \zeta = y^2 + z^2, \tag{13.1}$$

whilst $\sigma$ can be taken arbitrarily, subject, of course, to the
requirement that it cannot be written as a function of $\xi, \eta, \zeta$ alone. We now
observe that $\xi, \eta, \zeta$ are all separately invariant under rotations,
i.e. $y^{*2} + z^{*2} = y^2 + z^2$ for all $\theta$, and so on. For this reason these
three quantities are often called (elementary) rotational invariants.
Next suppose that $\sigma$ could be defined in such a way that it also is
invariant under rotations. Then, since any function of $y', z', y, z$
can equally well be written as a function of $\xi, \eta, \zeta, \sigma$, one would be
able to conclude that every function of $y', z', y, z$ is invariant under
rotations, and this is certainly not the case. It follows that no $\sigma$
which is functionally independent of $\xi, \eta, \zeta$ can be rotationally
invariant. V therefore cannot be a function of $\sigma$. In short, we have
shown that


V is invariant under rotations if and only if it depends
on $y', z', y, z$ through the combinations $\xi, \eta, \zeta$ alone.


It remains to take the condition of plane-symmetry into account.
At first sight it looks as if this were automatically satisfied, since $\xi$,
$\eta$, $\zeta$ are invariant under the substitution $z' \to -z'$, $z \to -z$. Clearly,
we must be mistaken since, as already remarked, axial symmetry
does not imply plane-symmetry; or, in other words, K may have a
built-in screw-sense. The error lies in our inadvertently thinking
only of single-valued functions of $\xi, \eta, \zeta$. This point is clearly
illustrated by the function

$$\tau = yz' - zy'. \tag{13.2}$$

It is easily shown to be invariant under rotations, and so must be a
function of $\xi, \eta, \zeta$; in fact, $\tau = \pm(\xi\zeta - \eta^2)^{1/2}$. (Incidentally, $\tau$ is not to
be included under the heading of elementary rotational invariants.)
Now when $z$ and $z'$ reverse sign then so does $\tau$, and no ambiguity
arises. When, however, $\tau$ is regarded as a function of $\xi, \eta, \zeta$ its
double-valuedness must be explicitly taken into account. Moreover,
if, for example, V contained a term which depended linearly on $\tau$
then it would not be invariant under reflections. In short, the
condition of plane-symmetry imposes definite limitations upon the form
of the dependence of V on $\xi, \eta, \zeta$.
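
The behaviour of these quantities under the two kinds of displacement is easily checked numerically. The sketch below assumes the definitions ξ = y'² + z'², η = yy' + zz', ζ = y² + z² and τ = yz' − zy', applies a rotation of the form (12.1) and a reflection of the form (12.2) to an arbitrarily chosen set of ray-coordinates, and compares the results. The function names and the sample values are hypothetical.

```python
import math

def invariants(yp, zp, y, z):
    """Return (xi, eta, zeta, tau) for the ray-coordinates (y', z', y, z)."""
    xi = yp * yp + zp * zp
    eta = y * yp + z * zp
    zeta = y * y + z * z
    tau = y * zp - z * yp
    return xi, eta, zeta, tau

def rotate(yp, zp, y, z, theta):
    """Rotation through theta about the axis, applied in both spaces."""
    c, s = math.cos(theta), math.sin(theta)
    return yp * c - zp * s, yp * s + zp * c, y * c - z * s, y * s + z * c

def reflect(yp, zp, y, z):
    """Reflection in the meridional plane."""
    return yp, -zp, y, -z

ray = (0.3, -0.7, 1.1, 0.4)            # illustrative values only
base = invariants(*ray)
turned = invariants(*rotate(*ray, 0.9))
mirrored = invariants(*reflect(*ray))
```

All four quantities survive the rotation, but τ changes sign under the reflection whilst ξ, η, ζ do not; and τ² agrees with ξζ − η², which is the origin of the double-valuedness referred to above.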

These limitations can be stated explicitly when the condition of 
regularity is satisfied; and we now assume that this is the case. 
This means of course that V can be written as a power series in the 
ray-coordinates. We recognize that already because of rotational 
invariance this series cannot contain terms of odd degree, since a 
reversal of sign of all the ray-coordinates is equivalent to a rotation 
through 180°. At this stage of the argument it is convenient
temporarily to introduce polar coordinates in both base-planes:

$$y' = \rho\cos\theta, \quad z' = \rho\sin\theta, \quad y = \chi\cos\phi, \quad z = \chi\sin\phi. \tag{13.3}$$

Then V becomes a power series in $\rho$ and $\chi$, a typical term of which,
$X$, say, has the form $\omega\rho^{\mu}\chi^{\nu}$ ($\mu + \nu$ even), where $\omega$ is a sum of
products of the trigonometric functions in (13.3). However, $\theta$ and $\phi$
can, in effect, occur in $\omega$ only in the combination $\psi = \theta - \phi$,
since under a rotation $\theta$ and $\phi$ change by the same amount. In short,
$\omega$ is generically a sum of products of $\cos\psi$ and $\sin\psi$. Symmetry
about the meridional plane now requires the absence of odd
powers of $\sin\psi$, since a reversal of sign of $z'$ and $z$ is equivalent to a
reversal of sign of $\psi$. This means that $\omega$ can always be written simply
as a polynomial in $\cos\psi$ alone. Now recall that $X$ arose from making
the substitutions (13.3) in a series whose terms contained only
integral powers of the ray-coordinates, so that the same must be
true of $X$. Since $\xi = \rho^2$, $\eta = \rho\chi\cos\psi$, $\zeta = \chi^2$ it follows that $X$ must
be a sum of products of $\xi$, $\eta$ and $\zeta$. Consequently we have shown that


the characteristic function of a regular symmetric system
can be written as a power series in the three elementary
rotational invariants. (13.4)


We have stated this important result in a form which no longer 
explicitly refers to any particular characteristic function F, since 
it is clearly valid for any choice of the latter, bearing in mind the 
concluding remarks of the preceding section. The elementary 
rotational invariants must of course be appropriate in each case to 
the particular choice of F. In terms of the generic notation of 
Section 6 the invariants 


$$\xi = q_1^2 + q_2^2, \quad \eta = q_1 q_3 + q_2 q_4, \quad \zeta = q_3^2 + q_4^2 \tag{13.5}$$
are admissible. Nevertheless these are not the only possibilities,
for any three linearly independent linear combinations of $\xi, \eta, \zeta$
will do as well. Such combinations will then usually be denoted
again by $\xi, \eta, \zeta$. In short, we have here an excellent example of the
convention regarding the repeated use of the same set of symbols 
discussed at the end of Section 6. 

Finally, it should be remarked that the result (13.4) is by no means 
trivial, though it is often taken for granted. Unfortunately some 
discussions seem to imply that it is valid for any axially symmetric
regular system. This, however, is clearly not the case (see also 
Section 72). 


14. Paraxial optics 

The work of this section will to some extent be a repetition of that of 
Section 10, but will naturally be much simpler in its details. The 
base-ray $\mathcal{R}_0$ now coincides with the axis $\mathcal{A}$ of K, so that we speak
of paraxial, rather than of parabasal, optics. We have the freedom
to use whatever characteristic function we please, and for the time
being we continue to use V. Retaining only terms of the second
degree (in the ray-coordinates) we now have

$$V = a_0 + \tfrac12 k_1\xi + k_2\eta + \tfrac12 k_3\zeta, \tag{14.1}$$

where $a_0, k_1, k_2, k_3$ are constants of the system. These of course still
depend upon the exact location of the base-planes, which are for
the present arbitrary normal planes. (A normal plane is any plane
normal to $\mathcal{A}$.) It will be recalled that the anterior base-plane $\mathcal{B}$
has the equation $x = 0$, the posterior base-plane $\mathcal{B}'$ the equation
$x' = 0$, in view of the arrangement of the coordinate basis. The axial
points of $\mathcal{B}$ and $\mathcal{B}'$ will be denoted by $B$, $B'$ respectively.
Comparing (14.1) with (10.2) we have

$$b_1 = b_5, \quad b_3 = b_7, \quad b_8 = b_{10}, \quad b_2 = b_4 = b_6 = b_9 = 0. \tag{14.2}$$

This is a situation of great simplicity, paraxial imagery being
governed by only three constants. Now, using (6.1), or directly from
(10.3),

$$\begin{aligned} \beta' &= k_1 y' + k_2 y, & \gamma' &= k_1 z' + k_2 z, \\ -\beta &= k_2 y' + k_3 y, & -\gamma &= k_2 z' + k_3 z. \end{aligned} \tag{14.3}$$


We have here the first example of sets of equations which naturally 
group themselves into pairs, and it is convenient to adopt an appro- 
priate notation. For this purpose we take (14.3) as a typical example. 
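Thus the first and second pair of equations will be written as

$$\boldsymbol{\beta}' = k_1\mathbf{y}' + k_2\mathbf{y}, \quad -\boldsymbol{\beta} = k_2\mathbf{y}' + k_3\mathbf{y} \tag{14.4}$$

respectively. In effect, we are using a two-vector notation which is
largely self-explanatory. Any symbol in bold-face type stands for
two ‘components’, e.g. $\boldsymbol{\beta}$ for $\beta$ and $\gamma$, as above, or $\boldsymbol{\epsilon}'$ for $\epsilon'_y$ and $\epsilon'_z$;
and so on. If desired one may use the usual notation for the scalar
product, so that, for example, $\boldsymbol{\beta}\cdot\boldsymbol{\beta} = \beta^2 + \gamma^2$.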

The basic equations of ‘Gaussian optics’ are just the equations 
(10.5) when specialized to the symmetric case:

$$\mathbf{y}' = A\mathbf{y} + B\boldsymbol{\beta}, \quad \boldsymbol{\beta}' = C\mathbf{y} + D\boldsymbol{\beta}. \tag{14.5}$$

Inserting (14.4) into (14.5) one finds, with $c = 1/(k_2^2 - k_1 k_3)$,
that

$$A = -k_3/k_2, \quad B = -1/k_2, \quad C = 1/ck_2, \quad D = -k_1/k_2. \tag{14.6}$$

From these the identity

$$AD - BC = 1 \tag{14.7}$$

follows at once.
Let $\mathcal{D}'$ be the normal plane $x' = d'$ in the image space. A ray
$\mathcal{R}$ intersects it in the point

$$\mathbf{Y}' = \mathbf{y}' + d'\boldsymbol{\beta}' = (1 + d'k_1)\mathbf{y}' + d'k_2\mathbf{y}, \tag{14.8}$$

cf. (10.15). If $\mathcal{R}$ is parallel to the axis in the object space, then

$$k_2\mathbf{Y}' = (d'/c - k_3)\mathbf{y}.$$

Evidently the ray intersects $\mathcal{D}'$ in its axial point $F'$ if $d'$ has the value

$$d'_F = ck_3. \tag{14.9}$$

The particular point $F'$ so defined is the posterior focal point of
K. It suffices to suppose that $\mathcal{R}$ is meridional, in view of the
symmetry of K, i.e. to take $z = z' = 0$. Its actual (unreduced) distance
from $\mathcal{A}$ in the object space is then $y/N$. Let $\mathcal{R}$ intersect $\mathcal{D}'$ in a point
whose actual normal distance from $\mathcal{A}$ is just $y/N$ again, so that

$$Y'/N' = y/N.$$

Then $d'$ must be chosen to have the value

$$d'_P = c(N'k_2/N + k_3). \tag{14.10}$$

The distance $d'_F - d'_P$ is, by definition, the (reduced) posterior focal
length $\bar{f}'$ of K, so that

$$\bar{f}' = -N'ck_2/N. \tag{14.11}$$

The anterior focal length $\bar{f}$ is defined analogously in terms of a
ray which is parallel to the axis in the image space, and it turns out
that

$$\bar{f}'/N'^2 = \bar{f}/N^2. \tag{14.12}$$


The explicit appearance of the refractive indices in (14.11) and 
(14.12) is a nuisance, since if they are not removed forthwith by 
some means or other they will occur time and again later on. 
Accordingly we define the mean focal length $f$ as the geometric mean
of $\bar{f}$ and $\bar{f}'$. Thus

$$f = (\bar{f}\bar{f}')^{1/2}, \tag{14.13}$$

but the qualification ‘mean’ will henceforth be omitted. Now

$$\bar{f}' = N'f/N, \quad \bar{f} = Nf/N', \tag{14.14}$$

whilst the common ratio which appears in (14.12) is just $f/NN'$.
In terms of a terminology in common usage, this ratio is the
reciprocal of the power of K, and $f$ is the reciprocal of the modified
power. Note that (14.11) now reads simply

$$f = -ck_2. \tag{14.15}$$

All rays from a point $O(\mathbf{y})$ of $\mathcal{B}$ intersect $\mathcal{D}'$ in the same point
$O'(\mathbf{Y}')$ if one chooses

$$d' = -1/k_1, \tag{14.16}$$

for then (14.8) reduces to

$$\mathbf{Y}' = d'k_2\mathbf{y}. \tag{14.17}$$


This shows that, as far as paraxial imagery is concerned, a plane 
object has a sharp, undistorted plane image. Moreover, the
(reduced) magnification associated with $\mathcal{B}$ and $\mathcal{D}'$ is given by

$$m = d'k_2. \tag{14.18}$$

When the object point is at infinity the initial rays are characterized
by the constancy of $\boldsymbol{\beta}$. Then, from (14.8) and the second member of
(14.4),

$$\mathbf{Y}' = (1 - d'/ck_3)\mathbf{y}' - (d'k_2/k_3)\boldsymbol{\beta}. \tag{14.19}$$

The image is of course formed in the plane which has $d' = d'_F$;
and, using (14.9) and (14.15), the image height is

$$\mathbf{Y}' = f\boldsymbol{\beta}. \tag{14.20}$$
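
These results can be illustrated for a meridional ray by a few lines of arithmetic. The sketch below evaluates the height Y' = (1 + d'k₁)y' + d'k₂y of (14.8), first in the focal plane d' = ck₃ for several rays sharing one initial direction β, and then in the plane d' = −1/k₁ conjugate to the anterior base-plane. The numerical values and names are hypothetical, and c = 1/(k₂² − k₁k₃) as in the text.

```python
from fractions import Fraction as Fr

def receiving_height(k1, k2, d, yp, y):
    """Height Y' in the plane x' = d, from (14.8), for a meridional ray."""
    return (1 + d * k1) * yp + d * k2 * y

k1, k2, k3 = Fr(-1, 3), Fr(1, 2), Fr(2, 5)   # illustrative values only
c = 1 / (k2 * k2 - k1 * k3)
d_focal = c * k3            # position of the posterior focal point, (14.9)
f = -c * k2                 # focal length, (14.15)

beta = Fr(1, 50)            # common initial direction: object point at infinity
heights = []
for yp in (Fr(-1), Fr(0), Fr(2)):
    y = -(beta + k2 * yp) / k3          # second member of (14.4) solved for y
    heights.append(receiving_height(k1, k2, d_focal, yp, y))

d_conj = -1 / k1            # plane conjugate to the anterior base-plane, (14.16)
m = d_conj * k2             # magnification, (14.18)
```

Every ray of the parallel bundle arrives at the same height fβ in the focal plane, whilst in the conjugate plane the height is m times the object height irrespective of y'.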


We now write $\mathcal{I}$ in place of $\mathcal{B}$ to emphasize that it is to be
regarded as the object plane, whilst we write $\mathcal{I}'$ in place of $\mathcal{D}'$ to
indicate that this plane is conjugate to $\mathcal{I}$, i.e. that every point of $\mathcal{I}$ is
transformed by K into one point of $\mathcal{I}'$, their respective coordinates
bearing a fixed ratio to each other in the sense of equation (14.17).
$\mathcal{I}'$ is also called the ideal image plane corresponding to $\mathcal{I}$, for obvious
reasons. Similarly, if the point $O'$ in this plane is conjugate to the
point $O$ of $\mathcal{I}$, then $O'$ is called the ideal image point (of $O$).

The system will generally contain a stop somewhere, that is to 
say, some plane screen, normal to the axis, which is provided with 
an aperture of some shape. It is this stop which we suppose to limit 
the bundles of rays capable of passing through K, though in practice 
this limitation is often imposed, at least in part, by other obstacles, 
such as lens rims, in which case one speaks of vignetting. At any 
rate, we suppose the stop (i.e. its aperture) to be circular, concentric
with the axis. The paraxial images formed of the axial point of the
stop by the parts of K respectively preceding and following it of
course lie on the axis, and they will be denoted by E and E'. The
normal planes through E and E' are called the planes of the (paraxial)
entrance and exit pupil respectively, the pupils being the generally
somewhat ill-defined images of the stop which lie approximately in 
the planes just defined. A family of rays just grazing the rim of the 
stop will generally intersect the plane of the paraxial exit pupil in a 
curve which approximates a circle concentric with the axis. It 
should, however, be borne in mind that cases arise in practice when 
what has just been said is completely false, for example, in the case 
of some wide-angle photographic objectives. 


To recapitulate, we have at hand the following planes: the object
plane $\mathscr{I}$ and its conjugate ideal image plane $\mathscr{I}'$, the anterior and
posterior base-planes $\mathscr{B}$ and $\mathscr{B}'$, the planes $\mathscr{E}$ and $\mathscr{E}'$ of the paraxial
entrance and exit pupils; the axial points of these planes, taken
in order, being $O_0$, $O_0'$, B, B', E and E', whilst F and F' are the focal
points; see Fig. 4.1. In any particular situation some of these planes
may be mutually coincident, and indeed, will usually be so by
choice, depending upon the problem in hand, and on the particular
characteristic function used to deal with it. Thus, in the present
circumstances $\mathscr{B}$ coincides with $\mathscr{I}$, and we now choose $\mathscr{B}'$ to coincide
with $\mathscr{E}'$, so that d' is the fixed distance between E' and $O_0'$.

We may now slightly extend our previous work. To this end let
$x = q$ be a plane in the object space, and $x' = q'$ a plane in the image
space. A ray intersects these in the points
$$Y_q = y + q\beta, \qquad Y'_{q'} = y' + q'\beta'$$
respectively. Thus
$$Y_q = -qk_2\,y' + (1 - qk_3)\,y, \qquad Y'_{q'} = (1 + q'k_1)\,y' + q'k_2\,y. \tag{14.21}$$
The planes in question are mutually conjugate if $Y_q$ and $Y'_{q'}$ stand
in a fixed ratio to each other for all $y'$, $y$, which will be the case if the
discriminant of the equations (14.21) vanishes, i.e. if
$$c^{-1}qq' + k_1q' - k_3q + 1 = 0. \tag{14.22}$$

The (reduced) magnification $\mathfrak{m}$ associated with these planes is then
given by
$$\mathfrak{m} = -\frac{1 + k_1q'}{k_2q} = \frac{k_2q'}{1 - k_3q}, \tag{14.23}$$
provided neither $q$ nor $q'$ is infinite. Take the pupil planes as an
example. Then the distance from $O_0$ to E is the analogue of $d'$,
and we therefore write $d$ for it. Setting $q' = 0$ in (14.22), we find at
once that
$$d = 1/k_3, \tag{14.24}$$
whilst, if $s$ is the magnification associated with the pupil planes,
(14.23) gives
$$s = -k_3/k_2. \tag{14.25}$$
This may be compared with (14.18), i.e.
$$m = -k_2/k_1. \tag{14.26}$$
From these results one infers easily that
$$d' = (s - m)f \tag{14.27}$$
and
$$d = \left(\frac{1}{s} - \frac{1}{m}\right) f. \tag{14.28}$$
Recalling (14.16) one can therefore write
$$k_1 = -1/d', \qquad k_2 = m/d', \qquad k_3 = -sm/d', \tag{14.29}$$
with $d'$ given by (14.27). In this way the paraxial coefficients are
expressed in terms of $s$, $m$ and $f$ alone. (14.1) may now be written as
$$V = (a - d') - \frac{1}{2d'}(\xi - 2m\eta + m^2\zeta) - \frac{m}{2f}\,\zeta. \tag{14.30}$$
We note in passing that the pairs of conjugate planes defined by the
value unity of the actual and of the reduced magnification are
known as principal planes and nodal planes respectively.

It should be clear by now how any question concerning paraxial
imagery can be answered by appealing to (14.4) and (14.21). 
Moreover, though we proceeded above on the basis of the point 
characteristic, any other characteristic function would have served 
as well. It is hardly necessary to demonstrate this in detail. Indeed, 
it will suffice briefly to consider the angle characteristic; doing so 
chiefly to bring out clearly a certain feature of the notation, which, 
if not explicitly remarked on, might be a source of confusion for the 
unwary reader. 
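
The relations (14.22) to (14.29) lend themselves to a quick numerical check. What follows is only a minimal sketch in Python and is not part of the original treatment: the function names are illustrative, the values of s, m and f are arbitrary, and the constant c is assumed to be $1/(k_2^2 - k_1k_3)$, which is the value implied jointly by (14.15) and (14.27) to (14.29).

```python
def paraxial_coefficients(s, m, f):
    """Return (d', d, k1, k2, k3) according to (14.27)-(14.29)."""
    d_prime = (s - m) * f                # (14.27)
    d = (1.0 / s - 1.0 / m) * f          # (14.28)
    k1 = -1.0 / d_prime                  # (14.29)
    k2 = m / d_prime
    k3 = -s * m / d_prime
    return d_prime, d, k1, k2, k3


def conjugacy_residual(q, q_prime, k1, k2, k3):
    """Left-hand member of (14.22); it vanishes for conjugate planes."""
    inv_c = k2 * k2 - k1 * k3            # assumed value of 1/c (see lead-in)
    return inv_c * q * q_prime + k1 * q_prime - k3 * q + 1.0


s, m, f = 2.0, -0.5, 100.0               # illustrative values only
d_prime, d, k1, k2, k3 = paraxial_coefficients(s, m, f)

focal_length = -k2 / (k2 * k2 - k1 * k3)             # (14.15): f = -c*k2
image_residual = conjugacy_residual(0.0, d_prime, k1, k2, k3)
pupil_residual = conjugacy_residual(d, 0.0, k1, k2, k3)
image_magnification = k2 * d_prime / (1.0 - k3 * 0.0)      # (14.23), q = 0
pupil_magnification = -(1.0 + k1 * 0.0) / (k2 * d)         # (14.23), q' = 0
```

With these values the object and image planes ($q = 0$, $q' = d'$) and the pupil planes ($q = d$, $q' = 0$) both satisfy (14.22), and the two magnifications come out as m and s respectively.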

Accordingly we now begin with the equation
$$T = a_0 + \tfrac{1}{2}k_1\xi + k_2\eta + \tfrac{1}{2}k_3\zeta. \tag{14.31}$$
Here the end of Section 13 must be recalled: $\xi$, $\eta$, $\zeta$ now stand for
$\beta'^2 + \gamma'^2$, $\beta'\beta + \gamma'\gamma$, $\beta^2 + \gamma^2$ respectively, and likewise the
significance of the constants $k_1$, $k_2$, $k_3$ is different from that which they
had in (14.1). By (6.2),
$$-y' = k_1\beta' + k_2\beta, \qquad y = k_2\beta' + k_3\beta. \tag{14.32}$$


These equations relate to an arbitrary choice of base-planes; 
and they are therefore entirely equivalent to (14.4). We could now 
proceed exactly as before; but the work would differ only trivially 
from that following equations (14.4). 

However, note that in the context of the angle characteristic the
use of conjugate base-planes is permitted. If, therefore, we now
choose $\mathscr{B}$ and $\mathscr{B}'$ to be $\mathscr{I}$ and $\mathscr{I}'$ respectively, the points O(y)
and O'(y') must be conjugate to each other, i.e. $y'$ must stand in a
fixed ratio to $y$, this ratio being of course just the magnification $m$.
It follows at once that
$$k_1/k_2 = k_2/k_3 = -m. \tag{14.33}$$
The posterior focal length is $f' = -N'y/N\beta'$, calculated for a
meridional ray which is parallel to the axis in the object space.
Drawing upon (14.14) and (14.32) with $\beta = 0$, it follows that
$$f = -k_2. \tag{14.34}$$
Therefore
$$T = a_0 + (f/2m)(m^2\xi - 2m\eta + \zeta). \tag{14.35}$$
At this point a word needs to be said about telescopic systems,
which have $f = \infty$. (14.35) shows at once that the theory breaks
down in this case; as we already know, of course, from Section 4.




It is, however, not necessary to go again through the preceding work
in detail. Thus, inspection of equations (14.27) and (14.28) shows
that we must, of necessity, now have $s = m$; which means that the
magnification associated with all pairs of conjugate planes is the
same, say $m$. (14.22) simplifies to
$$k_1q' - k_3q + 1 = 0, \tag{14.36}$$
i.e. $q'$ and $q$ are linearly related to each other. With the usual choice
of base-planes, equations (14.29) apply as before (with $s = m$),
whilst (14.30) becomes
$$V = \text{const.} - \frac{1}{2d'}(\xi - 2m\eta + m^2\zeta). \tag{14.37}$$
One also has the simple relation
$$d' = -m^2 d. \tag{14.38}$$


When $\mathscr{I}$ is at infinity, and therefore $\mathscr{I}'$ also, one has a still more
specialized situation, to which (14.37) and (14.38) of course do not
apply. One then has to choose any convenient pair of finitely situated
(non-conjugate) base-points B, B'. V, as given by (14.37), may
then be taken to refer to these, if $m$ be interpreted as the magnification
associated with $\mathscr{B}$ and the plane conjugate to it, with a corresponding
interpretation of $d'$.

We return to the point of notation which was alluded to just before
equation (14.31). It is this: in the context of V the image point O'
had coordinates denoted by Y', Z', whereas in the context of T these
were denoted by y', z'. The reason for this state of affairs is that we
have agreed once and for all to use lower-case symbols for all
ray-coordinates and their conjugates. Granted the very convenient
choice of base-planes made hitherto, the conjugates $y'$ of $-\beta'$
simply are the coordinates of points in the ideal image plane. In
the case of V on the other hand the ray-coordinates $y'$ relate to
points in $\mathscr{B}'$ and this plane must be distinct from $\mathscr{I}'$; so that
new symbols are required for the coordinates of points in $\mathscr{I}'$.
No genuine ambiguity is inherent in this notation. Once again, the 
meaning of a particular symbol depends upon the context in which 
it occurs; a situation which reflects the wide freedom one has in the 
choice of characteristic functions, and of base-planes, once a particular
characteristic function has been adopted.


15. Ideal characteristic functions


To specify a particular ideal characteristic function $F_0$ we must
first decide under which circumstances the behaviour of a given
system K is to be regarded as 'ideal'; as was emphasized in the
course of the discussion of Section 7. Accordingly, bearing in mind
what is perhaps the most common situation in practice, we desire to
form a sharp, undistorted, plane image of an object lying in a
fixed normal plane $\mathscr{I}$. Our immediate task is to determine the
characteristic function $F_0$ which K would have to have to exhibit
the desired behaviour. Although later on we shall investigate the
properties of systems with particular symmetries with the aid of
whatever characteristic function appears to be the most convenient,
we do not want to make any particular choice for the present and so
determine $V_0$, $T_0$, $W_{10}$ and $W_{20}$ in turn.

$V_0$ has of course already been considered in a more general setting
in Section 7. Nevertheless it will do no harm to recapitulate the
work in the present context. To begin with, we note that since all
rays from a point O of $\mathscr{I}$ are to pass through one point O' of $\mathscr{I}'$
this must be true for paraxial rays in particular. $\mathscr{I}'$ is therefore the
ideal image plane and O' the ideal image point I' conjugate to O.
As base-planes we choose $\mathscr{I}$ and $\mathscr{B}'$. Then, if a ray through O and O'
intersects $\mathscr{B}'$ in D',
$$V(O, D') = V(O, O') - V(D', O'),$$
in the usual notation. Now V(O, O') is a constant for all rays through
O, and so depends on $y$ and $z$ only. Since these variables can occur
only in the combination $\zeta$, we have
$$V(O, O') = g(\zeta), \tag{15.1}$$
where the function $g(\zeta)$ is not further determined here. Bearing in
mind that $Y' = my$, the required result
$$V_0 = g(\zeta) - (d'^2 + \xi - 2m\eta + m^2\zeta)^{1/2} \tag{15.2}$$
follows at once. Note that if we write
$$g(\zeta) = g_0 + \tfrac{1}{2}g_1\zeta + O(4), \tag{15.3}$$
(15.2) gives
$$V_0 = (g_0 - d') - \frac{1}{2d'}(\xi - 2m\eta + m^2\zeta) + \tfrac{1}{2}g_1\zeta, \tag{15.4}$$


in agreement with (14.30); and $g_1$ is seen to have the value $-m/f$.

When the object is at infinity the choice of $\mathscr{I}$ as anterior base-plane
is clearly no longer permitted. (Physically: all optical distances
between points of the base-planes become infinite.) Any other
choice of anterior base-plane is, however, very inconvenient in the
context of the point characteristic. One can help oneself with a
limiting process, that is to say, one first takes $\mathscr{I}$ to be at a finite
distance from K (so that $m \neq 0$) and eventually allows $\mathscr{I}$ to recede
to infinity ($d \to \infty$). To avoid formal complications we therefore
introduce independent variables $y_1$ in place of $y$, where
$$y_1 = my, \tag{15.5}$$
so that $y_1$, $z_1$ are simply the coordinates of the ideal image point.
$\eta$ and $\zeta$ then refer to $y_1$ rather than to $y$, e.g. $\zeta = y_1^2 + z_1^2$. Of course,
one now has $\beta = -m\,\partial V/\partial y_1$. In place of (15.2),
$$V_0 = g(\zeta) - (d'^2 + u)^{1/2}, \tag{15.6}$$
where
$$u = \xi - 2\eta + \zeta, \tag{15.7}$$
and $g$ is the same function of the new $\zeta$ as $g$ was of the old. When
$m \to 0$ infinities now only occur in $g(\zeta)$, but this function does not
contribute to the displacement. In short, the formal device just
introduced allows one to discuss aberrations on the basis of V even
when the object is at infinity; so that this case, which commonly
occurs in practice, need not be treated separately.
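
That (15.2) does indeed send every ray from O through the single point with coordinates my, mz can be confirmed numerically. The sketch below is illustrative only: it assumes unit refractive index in the image space, so that $\beta' = \partial V/\partial y'$ and $\gamma'$ are direction cosines with $\alpha' = (1 - \beta'^2 - \gamma'^2)^{1/2}$, it takes $d' > 0$, it picks an arbitrary $g(\zeta)$, and it replaces the derivatives by central differences. The function names and numerical values are hypothetical.

```python
import math


def ideal_point_characteristic(yp, zp, y, z, m, d_prime):
    """V_0 of (15.2), with the arbitrary choice g(zeta) = 0.3*zeta."""
    xi = yp * yp + zp * zp           # invariant of the coordinates in B'
    eta = yp * y + zp * z
    zeta = y * y + z * z             # invariant of the object coordinates
    g = 0.3 * zeta                   # any g(zeta) would serve equally well
    return g - math.sqrt(d_prime ** 2 + xi - 2.0 * m * eta + m * m * zeta)


def image_plane_intersection(yp, zp, y, z, m, d_prime, h=1e-3):
    """Point in which the ray meets the plane at distance d' behind B'."""
    v = ideal_point_characteristic
    # beta' = dV/dy' and gamma' = dV/dz' by central differences
    beta_p = (v(yp + h, zp, y, z, m, d_prime)
              - v(yp - h, zp, y, z, m, d_prime)) / (2.0 * h)
    gamma_p = (v(yp, zp + h, y, z, m, d_prime)
               - v(yp, zp - h, y, z, m, d_prime)) / (2.0 * h)
    alpha_p = math.sqrt(1.0 - beta_p ** 2 - gamma_p ** 2)
    return (yp + d_prime * beta_p / alpha_p,
            zp + d_prime * gamma_p / alpha_p)


m, d_prime = -0.5, 250.0             # illustrative values only
y, z = 10.0, -4.0                    # object point O
hits = [image_plane_intersection(yp, zp, y, z, m, d_prime)
        for yp, zp in [(0.0, 0.0), (30.0, 12.0), (-45.0, 20.0)]]
```

Each entry of `hits` agrees with (my, mz) to within the accuracy of the finite differences, whatever point of the posterior base-plane the ray passes through, and independently of the choice of g.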

We next turn our attention to $T_0$. Here we have to proceed quite
differently. As base-planes we now take $\mathscr{I}$ and $\mathscr{I}'$, so that $y' = my$
for all rays. This means that $T_0$ must satisfy each of the simple
differential equations
$$\frac{\partial T_0}{\partial\beta'} + m\frac{\partial T_0}{\partial\beta} = 0, \qquad \frac{\partial T_0}{\partial\gamma'} + m\frac{\partial T_0}{\partial\gamma} = 0. \tag{15.8}$$
The first of these shows that $\beta$ and $\beta'$ can occur in $T_0$ only in the
combination $m\beta' - \beta$, whilst the second likewise shows that it can
depend on $\gamma$ and $\gamma'$ only through the combination $m\gamma' - \gamma$. Since,
however, $T_0$ must be a function of rotational invariants, it follows
that
$$T_0 = g\{(m\beta' - \beta)^2 + (m\gamma' - \gamma)^2\}, \tag{15.9}$$


where $g$ is a function of one argument. The form of (15.9) suggests
the introduction of new rotational invariants in place of $\beta'^2 + \gamma'^2$, etc.
A symmetrical choice is
$$\begin{aligned}
\xi &= (s\beta' - \beta)^2 + (s\gamma' - \gamma)^2,\\
\eta &= (s\beta' - \beta)(m\beta' - \beta) + (s\gamma' - \gamma)(m\gamma' - \gamma),\\
\zeta &= (m\beta' - \beta)^2 + (m\gamma' - \gamma)^2.
\end{aligned} \tag{15.10}$$
Here $s$ is still a disposable constant ($s \neq m$), but it will be convenient
to take it as the magnification associated with the pupil planes.
Note that the relations between the old and the new rotational
invariants are linear. Indeed, if we now exceptionally write $\bar\xi$, $\bar\eta$, $\bar\zeta$
for the former,
$$\begin{aligned}
\xi &= s^2\bar\xi - 2s\bar\eta + \bar\zeta,\\
\eta &= sm\bar\xi - (s+m)\bar\eta + \bar\zeta,\\
\zeta &= m^2\bar\xi - 2m\bar\eta + \bar\zeta,
\end{aligned} \tag{15.11}$$
or conversely,
$$\begin{aligned}
(s-m)^2\bar\xi &= \xi - 2\eta + \zeta,\\
(s-m)^2\bar\eta &= m\xi - (s+m)\eta + s\zeta,\\
(s-m)^2\bar\zeta &= m^2\xi - 2sm\eta + s^2\zeta.
\end{aligned} \tag{15.12}$$
The ideal angle characteristic now takes the simple form
$$T_0 = g(\zeta), \tag{15.13}$$
in generic agreement with the paraxial limit (14.35).
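
The linearity of the relations between the invariants of Section 13 and those of (15.10), and the fact that (15.12) inverts (15.11), can be checked by direct substitution. The following Python sketch does this for one arbitrarily chosen ray; the function names and the numerical values are purely illustrative.

```python
def section13_invariants(beta_p, gamma_p, beta, gamma):
    """The 'old' invariants: beta'^2+gamma'^2, beta'beta+gamma'gamma, beta^2+gamma^2."""
    return (beta_p ** 2 + gamma_p ** 2,
            beta_p * beta + gamma_p * gamma,
            beta ** 2 + gamma ** 2)


def invariants_15_10(beta_p, gamma_p, beta, gamma, s, m):
    """The 'new' invariants, computed directly from their definition (15.10)."""
    ay, az = s * beta_p - beta, s * gamma_p - gamma
    by, bz = m * beta_p - beta, m * gamma_p - gamma
    return (ay * ay + az * az, ay * by + az * bz, by * by + bz * bz)


def forward_15_11(old, s, m):
    """New invariants in terms of the old ones, as in (15.11)."""
    xi_o, eta_o, zeta_o = old
    return (s * s * xi_o - 2.0 * s * eta_o + zeta_o,
            s * m * xi_o - (s + m) * eta_o + zeta_o,
            m * m * xi_o - 2.0 * m * eta_o + zeta_o)


def inverse_15_12(new, s, m):
    """Old invariants in terms of the new ones, as in (15.12)."""
    xi, eta, zeta = new
    k = (s - m) ** 2
    return ((xi - 2.0 * eta + zeta) / k,
            (m * xi - (s + m) * eta + s * zeta) / k,
            (m * m * xi - 2.0 * s * m * eta + s * s * zeta) / k)


ray = (0.10, -0.05, 0.02, 0.03)      # beta', gamma', beta, gamma (illustrative)
s, m = 2.0, -0.5
old = section13_invariants(*ray)
new = invariants_15_10(*ray, s, m)
new_from_old = forward_15_11(old, s, m)
old_from_new = inverse_15_12(new, s, m)
```

Here `new_from_old` reproduces `new`, and `old_from_new` reproduces `old`, to rounding error; the inversion evidently fails only when $s = m$.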
When $\mathscr{I}$ recedes to infinity, the only resulting infinities occur in
the function $g(\zeta)$; but this is irrelevant to the displacement. Alternatively,
one may go over to a new anterior base-plane, taking this
to be $\mathscr{E}$ now. Ideal imagery requires that
$$y' = f\beta/\alpha, \tag{15.14}$$
the value of the constant of proportionality on the right being fixed
by the paraxial limit. We therefore have
$$\frac{\partial T_0}{\partial\beta'} = -\frac{f\beta}{\alpha}, \qquad \frac{\partial T_0}{\partial\gamma'} = -\frac{f\gamma}{\alpha}, \tag{15.15}$$
and these may immediately be integrated with respect to $\beta'$ and
$\gamma'$ respectively. Taking rotational invariance into account it follows
that
$$T_0 = g(\zeta) - f\eta\,(1 - \zeta)^{-1/2}, \tag{15.16}$$
where $g$ as usual denotes some otherwise undetermined function of
its argument, and $\eta$, $\zeta$ are here the invariants of Section 13. In terms
of the invariants (15.10), with $m = 0$, (15.16) reads
$$T_0 = g(\zeta) + s^{-1}f\eta\,(1 - \zeta)^{-1/2}, \tag{15.17}$$
where certain terms which depend on $\zeta$ alone have been absorbed
in $g(\zeta)$.

Next we come to $W_1(y', z', \beta, \gamma)$. As in the case of T, we choose
$\mathscr{I}$ and $\mathscr{I}'$ as base-planes. Perfect imagery requires that
$$y' = my = m\,\partial W_{10}/\partial\beta. \tag{15.18}$$
In terms of the appropriate invariants (cf. (13.5)) this implies the
result
$$W_{10} = g(\xi) + \eta/m. \tag{15.19}$$
When the object recedes to infinity this breaks down completely;
as was to be expected, since one then has the exceptional situation in
which the use of $W_1$ is forbidden in principle. If we therefore go over
to the new base-planes $\mathscr{E}$ and $\mathscr{E}'$ we deduce, by an argument similar
to that which led to (15.2), that (when $m = 0$)
$$W_{10} = g(\zeta) - \{d'^2 + \xi - 2f\eta(1-\zeta)^{-1/2} + f^2\zeta(1-\zeta)^{-1}\}^{1/2}. \tag{15.20}$$
Finally, in the context of $W_2(\beta', \gamma', y, z)$ we again take $\mathscr{I}$ and $\mathscr{I}'$
as base-planes. Therefore, in place of (15.18),
$$my = y' = -\partial W_{20}/\partial\beta',$$
whence
$$W_{20} = g(\zeta) - m\eta. \tag{15.21}$$
As in the case of V, it is best to introduce $y_1$ in place of $y$ when the
object happens to be at infinity.

To end this discussion of ideal characteristic functions we inquire
briefly into the kind of condition one might impose upon K in
order to fix the form of the function g which appears repeatedly
above. Some limitation upon the aberrations associated with the
pupil planes clearly suggests itself. Accordingly we seek to determine
the function $g(\zeta)$ which occurs in (15.6) so that all rays through E
pass through E'.

Let a ray through E and E' intersect $\mathscr{I}$ in O. Then
$$V(O, E') = V(O, E) + V(E, E'). \tag{15.22}$$
In view of the assumed condition, V(E, E') = const. = e, say.
Further
$$V(O, E') = V_0(0, 0, y_1, z_1) = g(\zeta) - (d'^2 + \zeta)^{1/2}, \tag{15.23}$$
from (15.6), whilst
$$V(O, E) = (d^2 + \zeta/m^2)^{1/2}. \tag{15.24}$$
(15.22) now gives
$$g(\zeta) = (d'^2 + \zeta)^{1/2} + (d^2 + \zeta/m^2)^{1/2} + e, \tag{15.25}$$
so that $g(\zeta)$ is indeed fully determined. One could in turn define the
'ideal form' $g_0$ of g to be that given by (15.25), and the extent to
which any actual $g(\zeta)$ differs from $g_0$ will be reflected in the extent
to which rays through E will fail to pass also through E'.


16. Aberration functions 


In this section we reconsider, in the special context of the symmetric
system, much of the general work of Sections 7 and 11, and suitably
adapt the notation. The actual characteristic function F of K will in
general differ from the desired ideal $F_0$, and we write
$$F = F_0 + f, \tag{16.1}$$
as before (see also the end of Section 36); an equation which defines
f. Now recall that f can be expanded in ascending powers of the
rotational invariants $\xi$, $\eta$, $\zeta$. This means that in (11.1) only terms of
even degree can appear, i.e. $f_1, f_3, f_5, \ldots = 0$. We therefore write
$$f = \sum_{n=2}^{\infty} f^{(n)} \tag{16.2}$$
in place of (11.1), where $f^{(n)}$ is a homogeneous polynomial of degree
n in $\xi$, $\eta$, $\zeta$. Evidently $f^{(n)}$ is the aberration function of order $2n-1$,
whilst there are now no aberration functions of even order. As a
matter of convenience, one therefore sometimes refers to the aberrations
of order 3, 5, 7, 9, ... alternatively as primary, secondary,
tertiary, quaternary, ... aberrations. This is not a good terminology,
since the term 'primary' surely connotes dominance, so that in a
general system the primary aberrations will be of the second order
(cf. Section 88, in particular equation (88.16)). Note that in (16.2)
there is no term $f^{(1)}$ quadratic in the ray-coordinates. Such a term
would relate to paraxial defects of the image, but we already know
that when K is symmetric every object plane $\mathscr{I}$ has a definite
conjugate image plane $\mathscr{I}'$; and the paraxial imagery associated with
$\mathscr{I}$ and $\mathscr{I}'$ is perfect. A term $f^{(1)}$ can therefore arise only if, at some
stage, one decides to consider a receiving plane other than the ideal
image plane.

Since formal expansions are now in three rather than in four
variables, we set in place of (11.2)
$$f^{(n)} = \sum_{\mu=0}^{n}\sum_{\nu=0}^{\mu} f^{(n)}_{\mu\nu}\,\xi^{n-\mu}\eta^{\mu-\nu}\zeta^{\nu}. \tag{16.3}$$
The number of constant coefficients in $f^{(n)}$ is $\tfrac{1}{2}(n+1)(n+2)$. One
of these is, however, essentially redundant. This may be seen by
contemplating V, for example. The displacement does not depend
on the arbitrary function $g(\zeta)$ which appears in (15.6), so that the
terms of v which depend upon $\zeta$ alone are irrelevant; indeed, one
may think of them as absorbed in $g(\zeta)$. The number of essential
constants is therefore reduced to $\tfrac{1}{2}n(n+3)$. (See, however, the remarks
at the end of Section 17.) We thus see that the number $\mathfrak{n}$
of aberration coefficients of order n is given by
$$\mathfrak{n} = \begin{cases} \tfrac{1}{8}(n+1)(n+7) & (n\ \text{odd}),\\ 0 & (n\ \text{even}). \end{cases} \tag{16.4}$$


Accordingly one has 5, 9, 14, 20, ... aberrations of orders 3, 5, 7,
9, ... respectively, and none of even order. Of course, should it
happen that K has some additional symmetry, these numbers will
be further reduced. Also, (16.4) refers to a fixed position of $\mathscr{I}$.
If the imagery associated with arbitrarily situated (normal) conjugate
planes is to be described, one has to deal with one additional
coefficient in each odd order. Indeed, it is just that coefficient
which was above regarded as redundant, specifically the coefficient
of $\zeta^n$ in $v^{(n)}$ in the
case of V.
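
The counting argument which leads to (16.4) is easily reproduced by brute force. The short Python sketch below (with illustrative function names) enumerates the monomials of degree n in $\xi$, $\eta$, $\zeta$, discards the single term in $\zeta^n$, and compares the result with the closed expression for the odd orders $2n - 1 \geq 3$.

```python
def monomial_count(n):
    """Number of monomials xi^a eta^b zeta^c with a + b + c = n, by enumeration."""
    return sum(1 for a in range(n + 1) for b in range(n + 1 - a))


def essential_count(n):
    """Coefficients of f^(n) which affect the displacement (zeta^n term dropped)."""
    return monomial_count(n) - 1


def aberration_count(order):
    """Right-hand member of (16.4), intended for orders >= 2."""
    if order % 2 == 0:
        return 0
    return (order + 1) * (order + 7) // 8


counts = [aberration_count(order) for order in (3, 5, 7, 9)]
```

The list `counts` reproduces the numbers 5, 9, 14, 20 quoted in the text, and `aberration_count(2*n - 1)` agrees with `essential_count(n)` for every degree n that one cares to try.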


17. The displacement 


Geometrically the quantity of principal interest is the displacement
$\epsilon'$. In the context of V this is, in the usual notation,
$$\epsilon' = Y' - my = y' - y_1 + d'\beta'/\alpha'. \tag{17.1}$$
Proceeding from this as it stands, the constant $d'$ will occur later
a very large number of times, and this is a nuisance. Accordingly
we choose, in the present context, a new unit of length such that
the numerical measure of $d'$ becomes unity. This convention is
rather unorthodox when $d'$ happens to be negative, and then the
phrase 'unit of length' is somewhat metaphorical. At any rate,
every length is to be thought of as expressed as a multiple of $d'$, and
an expression such as $(d'^2 + u)^{1/2}$ is intended to mean $d'(1 + u/d'^2)^{1/2}$, the
positive square root being understood. With this prescription one
just does not worry about the sign of $d'$ in the course of formal work:
one restores $d'$ explicitly by dimensional arguments at the very end
of it, and then inserts its actual numerical value; see, for example,
the remarks following equations (23.3) and (23.5).
Now, recalling (15.6),
$$V = g(\zeta) - (1 + u)^{1/2} + v. \tag{17.2}$$
Here one does well to bear in mind that, strictly speaking, (17.2)
operates as a definition of $g + v$, rather than of v alone; for the previous
definition of g has now become meaningless, since the conjugate
planes in question are in fact not perfect. Restoration of
uniqueness requires a prescription as to how the terms of (17.2)
which depend on $\zeta$ alone are to be distributed amongst g and v. One
might require, for instance, that $v(0, 0, \zeta) = 0$, or else that $g(\zeta)$
depend linearly on $\zeta$.
From (17.2),
$$\beta' = \partial V/\partial y' = -(1+u)^{-1/2}(y' - y_1) + (2y'v_\xi + y_1v_\eta), \tag{17.3}$$
where differentiations with respect to $\xi$, $\eta$, $\zeta$ are indicated by the
appropriate subscripts; e.g. $v_\xi = \partial v/\partial\xi$. After some manipulation
(17.1) then becomes
$$\epsilon' = (y' - y_1)(1 - D) + (1+u)^{1/2}(2v_\xi y' + v_\eta y_1)D, \tag{17.4}$$
where
$$D = \{1 + 2(1+u)^{1/2}[2(\xi - \eta)v_\xi + (\eta - \zeta)v_\eta] - (1+u)(4\xi v_\xi^2 + 4\eta v_\xi v_\eta + \zeta v_\eta^2)\}^{-1/2}. \tag{17.5}$$
The general relationship between $\epsilon'$ and v is therefore quite complex.
With regard to $\epsilon'$ we retain the notation of Section 11, and write
$$\epsilon' = \sum_n \epsilon'_n \qquad (n = 3, 5, 7, \ldots). \tag{17.6}$$
If v has no terms of order less than $2n-1$, then
$$\epsilon'_{2n-1} = 2v^{(n)}_\xi y' + v^{(n)}_\eta y_1, \tag{17.7}$$
so that these are the terms induced by the aberration function of
order $2n-1$ in the displacement of the same order, even when terms
of lower orders are present.
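
The 'manipulation' leading from (17.1) and (17.3) to (17.4) and (17.5) may be checked numerically. In the Python sketch below, which is illustrative only, v is taken to be a polynomial of the second degree in the invariants with arbitrarily chosen coefficients p (the form later written down in (19.1)), $d'$ is unity, and $\alpha' = (1 - \beta'^2 - \gamma'^2)^{1/2}$ is assumed, i.e. unit refractive index in the image space. The displacement is computed once directly from (17.1) and (17.3), and once from the closed form.

```python
import math


def invariants_and_derivatives(yp, zp, y1, z1, p):
    """xi, eta, zeta and the derivatives v_xi, v_eta of a quadratic v."""
    xi = yp * yp + zp * zp
    eta = yp * y1 + zp * z1
    zeta = y1 * y1 + z1 * z1
    p1, p2, p3, p4, p5 = p
    v_xi = 2.0 * p1 * xi + p2 * eta + p3 * zeta
    v_eta = p2 * xi + 2.0 * p4 * eta + p5 * zeta
    return xi, eta, zeta, v_xi, v_eta


def displacement_direct(yp, zp, y1, z1, p):
    """(17.1) with d' = 1, using beta' and gamma' from (17.3)."""
    xi, eta, zeta, v_xi, v_eta = invariants_and_derivatives(yp, zp, y1, z1, p)
    u = xi - 2.0 * eta + zeta
    w = 1.0 / math.sqrt(1.0 + u)
    beta_p = -w * (yp - y1) + 2.0 * yp * v_xi + y1 * v_eta
    gamma_p = -w * (zp - z1) + 2.0 * zp * v_xi + z1 * v_eta
    alpha_p = math.sqrt(1.0 - beta_p ** 2 - gamma_p ** 2)
    return yp - y1 + beta_p / alpha_p, zp - z1 + gamma_p / alpha_p


def displacement_closed_form(yp, zp, y1, z1, p):
    """(17.4) with D given by (17.5)."""
    xi, eta, zeta, v_xi, v_eta = invariants_and_derivatives(yp, zp, y1, z1, p)
    u = xi - 2.0 * eta + zeta
    root = math.sqrt(1.0 + u)
    D = (1.0 + 2.0 * root * (2.0 * (xi - eta) * v_xi + (eta - zeta) * v_eta)
         - (1.0 + u) * (4.0 * xi * v_xi ** 2 + 4.0 * eta * v_xi * v_eta
                        + zeta * v_eta ** 2)) ** -0.5
    return ((yp - y1) * (1.0 - D) + root * (2.0 * v_xi * yp + v_eta * y1) * D,
            (zp - z1) * (1.0 - D) + root * (2.0 * v_xi * zp + v_eta * z1) * D)


p = (0.3, -0.2, 0.1, 0.05, -0.4)     # arbitrary illustrative coefficients
args = (0.10, 0.05, 0.02, 0.0)       # y', z', y_1, z_1 in units of d'
direct = displacement_direct(*args, p)
closed = displacement_closed_form(*args, p)
```

The two results coincide to rounding error, which amounts to the statement that $D = \{(1+u)^{1/2}\alpha'\}^{-1}$.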

Let us contrast the results just obtained with those one gets if one
uses T in place of V. In view of (15.13) we now have
$$T = g(\zeta) + t, \tag{17.8}$$
the rotational invariants being those defined by (15.12). Now
$$\epsilon' = y' - my = -\frac{\partial T}{\partial\beta'} - m\frac{\partial T}{\partial\beta}. \tag{17.9}$$
Using (17.8), all terms involving $g(\zeta)$ of course disappear and one
is left simply with
$$\epsilon' = (s - m)[2(\beta - s\beta')t_\xi + (\beta - m\beta')t_\eta]. \tag{17.10}$$
This equation looks very much simpler than its counterpart
(17.4). Alas, this simplicity is somewhat illusory. The reason for
this situation is as follows. One is almost always interested in
families of rays from points of the object. Any such point is defined
by constant values of y and z (granted of course that $\mathscr{I}$ and $\mathscr{B}$
coincide), i.e. the four coordinates which occur in (17.10) are
constrained by the conditions
$$\partial T/\partial\beta = y = \text{const.} \tag{17.11}$$


This means in effect that after the displacement has been written 
down according to (17.10), two of the four coordinates, or two 
suitable linear combinations of them, have to be eliminated in 
favour of y. This is, in general, a very tedious task; the difficulties 
being compounded by the need in practice to introduce coordinates 
in the exit pupil, say, at the same time (see also Section 23). In the 
case of V on the other hand, the tedious part of the work consists in 
expanding the factors multiplying $y'$ and $y_1$ in ascending powers of
$\xi$, $\eta$, $\zeta$; but in this instance y and z are already amongst the ray-
coordinates, and no further complications arise. The sad story in
this particular context is that in changing over from one charac- 
teristic function to another one may gain simplicity in one place, 
only to lose it in another. In investigations of a general kind, how- 
ever, elegance of the theory will often hinge on the appropriate 
choice of F. 

One final remark needs to be made concerning an incidental 
effect of the constraint (17.11). It is obvious that the left-hand 
member of this equation will contain the coefficients of those terms 
of T which multiply powers of ¢ alone. This means, in effect, that 
after the elimination described above has been carried out, the 
coefficient of order $2n-1$ which was rejected on the grounds of
redundancy in Section 16, will in fact appear in the displacement,
though only in terms of order exceeding $2n-1$ (see, for example,
equations (24.13)). The result (16.4) must therefore be appropriately
interpreted. Bearing in mind the work of the end of Section 15,
the process of elimination evidently brings in implicitly the
aberrations associated with the pupil planes. Note that this feature
does not arise in the context of the point characteristic. One would,
however, face an analogous situation there, if one eventually
decided to introduce coordinates in $\mathscr{E}$ in place of those in $\mathscr{E}'$ as is
sometimes done in practice. The effects of such a change of coordinates
may be far from trivial, for example in the case of photographic
objectives covering very wide fields. 


18. Out-of-focus image planes 


Occasionally it is desirable to consider the image not in $\mathscr{I}'$ but in
some normal plane $\mathscr{I}_\chi$, the axial distance from the first to the second
being $\chi d'$, that is to say $\chi$, since we have arranged $d'$ to have the value
unity. To define a sensible 'displacement' $\bar\epsilon'$ in $\mathscr{I}_\chi$, we first have to
define some point $I_\chi$ in this to serve as a counterpart to I'. A simple
choice appears to be the point whose coordinates are $(1+\chi)y_1$. If
a ray through O intersects $\mathscr{I}_\chi$ in the point $\bar Y'$ it is natural to define
$\bar\epsilon'$ as
$$\bar\epsilon' = \bar Y' - (1+\chi)y_1. \tag{18.1}$$
Then, referring to (17.1),
$$\bar\epsilon' - \epsilon' = \chi(\beta'/\alpha' - y_1) = \chi(\epsilon' - y');$$
whence
$$\bar\epsilon' = (1+\chi)\epsilon' - \chi y'. \tag{18.2}$$
This result is remarkably simple, for the out-of-focus displacement,
reckoned relative to $I_\chi$, is simply the sum of the usual displacement,
re-scaled by the factor $1+\chi$, and the linear term $-\chi y'$. Moreover,
when a non-zero order can be ascribed to $\chi$, as is usually the case,
the scale factor $1+\chi$ is to be ignored in considering the displacement
of a given order generated by the aberration function of the
same order.


19. The third-order displacement 


We proceed to investigate the geometrical significance of the 
aberrations of the various orders, beginning with the terms of 
lowest non-vanishing order, i.e. the third. Formal considerations
aside, it does not matter which particular characteristic function
we use for this purpose, as has been stressed repeatedly. On the 
whole we shall base our treatment on the point characteristic, except 
for occasional references to other characteristic functions. 

We write the third-order aberration function in the form
$$v^{(2)} = p_1\xi^2 + p_2\xi\eta + p_3\xi\zeta + p_4\eta^2 + p_5\eta\zeta. \tag{19.1}$$
The term in $\zeta^2$ alone has been omitted since it does not enter into
the displacement, or else may be thought of as having been absorbed
in the function $g(\zeta)$ of (15.6). The aberration coefficients
$p_1, \ldots, p_5$ are constants of the system for the given positions of the
various reference planes. They are, of course, the constants $v^{(2)}_{\mu\nu}$
of (16.3) (with $f = v$) in a simplified notation, i.e.
$$p_1 = v^{(2)}_{00}, \quad p_2 = v^{(2)}_{10}, \quad \ldots, \quad p_5 = v^{(2)}_{21}.$$


The third-order, or primary, displacement is now immediately
obtained from (17.7), with $n = 2$. Thus
$$\epsilon'_3 = (4p_1\xi + 2p_2\eta + 2p_3\zeta)y' + (p_2\xi + 2p_4\eta + p_5\zeta)y_1. \tag{19.2}$$
When investigating rays from a fixed point O of $\mathscr{I}$ one may take
$z_1 = 0$ without loss of generality, in view of the symmetry of K.
At the same time write $y_1 = h'$, and introduce polar coordinates in
$\mathscr{B}'$:
$$y' = \rho\cos\theta, \qquad z' = \rho\sin\theta. \tag{19.3}$$
Evidently the family of rays grazing the rim of the stop are defined,
at least approximately, by the constancy of $\rho$. At any rate, we now
have
$$\xi = \rho^2, \qquad \eta = \rho h'\cos\theta, \qquad \zeta = h'^2. \tag{19.4}$$
(19.2) then reads
$$\begin{aligned}
\epsilon'_{3y} &= 4p_1\rho^3\cos\theta + p_2\rho^2h'(2 + \cos 2\theta) + (2p_3 + 2p_4)\rho h'^2\cos\theta + p_5h'^3,\\
\epsilon'_{3z} &= 4p_1\rho^3\sin\theta + p_2\rho^2h'\sin 2\theta + 2p_3\rho h'^2\sin\theta.
\end{aligned} \tag{19.5}$$
For historical reasons one often writes this in terms of five other
constants $\sigma_1, \ldots, \sigma_5$ (the so-called 'Seidel coefficients'), defined by
$$\sigma_1 = 4p_1, \quad \sigma_2 = p_2, \quad \sigma_3 = p_4, \quad \sigma_4 = 2p_3 - p_4, \quad \sigma_5 = p_5 \tag{19.6}$$
(cf. the remarks following shortly after equations (41.3) and (45.13)).
Then
$$\begin{aligned}
\epsilon'_{3y} &= \sigma_1\rho^3\cos\theta + \sigma_2\rho^2h'(2 + \cos 2\theta) + (3\sigma_3 + \sigma_4)\rho h'^2\cos\theta + \sigma_5h'^3,\\
\epsilon'_{3z} &= \sigma_1\rho^3\sin\theta + \sigma_2\rho^2h'\sin 2\theta + (\sigma_3 + \sigma_4)\rho h'^2\sin\theta.
\end{aligned} \tag{19.7}$$
The total displacement is made up of four distinct partial dis- 
placements, characterized by their dependence upon the various 
powers of p and h’; whilst one of them is governed jointly by two 
coefficients. It is usual to speak of each of the partial displacements 


as being an aberration of a certain type, and we proceed to examine 
them in turn. 
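
Before the partial displacements are taken one by one, it may be helpful to see the two forms of the third-order displacement side by side. The following Python sketch (function names and numerical values being merely illustrative) evaluates (19.2) with the substitutions (19.3) and (19.4), and compares the result with the Seidel form (19.7) after the change of constants (19.6).

```python
import math


def displacement_from_p(p, rho, theta, h):
    """(19.2) with y' = rho cos(theta), z' = rho sin(theta), y_1 = h', z_1 = 0."""
    p1, p2, p3, p4, p5 = p
    yp, zp = rho * math.cos(theta), rho * math.sin(theta)
    xi, eta, zeta = rho * rho, rho * h * math.cos(theta), h * h     # (19.4)
    factor_yp = 4.0 * p1 * xi + 2.0 * p2 * eta + 2.0 * p3 * zeta    # 2 v_xi
    factor_y1 = p2 * xi + 2.0 * p4 * eta + p5 * zeta                # v_eta
    return factor_yp * yp + factor_y1 * h, factor_yp * zp


def seidel_from_p(p):
    """The change of constants (19.6)."""
    p1, p2, p3, p4, p5 = p
    return (4.0 * p1, p2, p4, 2.0 * p3 - p4, p5)


def displacement_from_seidel(sigma, rho, theta, h):
    """The components (19.7)."""
    s1, s2, s3, s4, s5 = sigma
    c, s = math.cos(theta), math.sin(theta)
    ey = (s1 * rho ** 3 * c + s2 * rho ** 2 * h * (2.0 + math.cos(2.0 * theta))
          + (3.0 * s3 + s4) * rho * h * h * c + s5 * h ** 3)
    ez = (s1 * rho ** 3 * s + s2 * rho ** 2 * h * math.sin(2.0 * theta)
          + (s3 + s4) * rho * h * h * s)
    return ey, ez


p = (0.3, -0.2, 0.1, 0.05, -0.4)      # arbitrary illustrative coefficients
sigma = seidel_from_p(p)
samples = [(0.2, 0.0, 0.1), (0.2, 1.1, 0.1), (0.35, 2.5, -0.15)]
pairs = [(displacement_from_p(p, *smp), displacement_from_seidel(sigma, *smp))
         for smp in samples]
```

Every pair in `pairs` agrees to rounding error, as it must, since (19.5) and (19.7) differ only in the naming of the constants.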


(i) Spherical aberration
The terms of (19.7) governed by $\sigma_1$ give rise to the partial displacement
$$\mathfrak{y} = \sigma_1\rho^3\cos\theta, \qquad \mathfrak{z} = \sigma_1\rho^3\sin\theta. \tag{19.8}$$
To avoid the constant repetition of a large number of affixes we
are using the symbols $\mathfrak{y}$ and $\mathfrak{z}$ for the components of partial displacements
in the equations such as (19.8), whatever the receiving
plane may happen to be. Indeed we may regard $\mathfrak{x}$, $\mathfrak{y}$, $\mathfrak{z}$ as relating
to a set of Cartesian axes having the usual orientation but with
origin at I' or $I_\chi$, as the case may be. Evidently a family of rays for
which $\rho$ = constant (henceforth called a zonal family) intersects $\mathscr{I}'$
in a circle of radius $\sigma_1\rho^3$, concentric with the ideal image point I'.
Varying now $\rho$ from 0 to its largest value, $\rho_0$ say, we see at once that
the image of O is a circular patch of light, centred on I', and of
radius $\sigma_1\rho_0^3$. This aberration is called (primary) spherical aberration.
Its name is somewhat unfortunate, in as far as it suggests some
connection with the presence of specifically spherical refracting or
reflecting surfaces in the system. However, the suggestion obviously
has no foundation in fact, and the terminology survives only by
tradition.

It is of advantage to consider an out-of-focus receiving plane $\mathscr{I}_\chi$.
If $\chi$ is taken to be of the second order (an assumption yet to be justified)
we have, in view of (18.2),
$$\mathfrak{y} = (\sigma_1\rho^3 - \chi\rho)\cos\theta, \qquad \mathfrak{z} = (\sigma_1\rho^3 - \chi\rho)\sin\theta.$$
It suffices to take $h' = 0$ since spherical aberration is independent
of $h'$, and at the same time to restrict oneself to meridional rays,
$\theta = 0$ or $\pi$, in view of the symmetry of K. The smallest illuminated
disk, the so-called disk of least confusion, will be found in that plane
for which
$$\mathfrak{y}(\rho_0) = \mathfrak{y}(\hat\rho), \tag{19.9}$$
where $\hat\rho$ is that value of $\rho$ which maximizes $|\mathfrak{y}(\rho)|$, i.e. $\hat\rho^2 = \chi/3\sigma_1$.
One then finds easily that
$$|\hat\rho| = \tfrac{1}{2}\rho_0, \qquad \chi = \tfrac{3}{4}\sigma_1\rho_0^2, \qquad \mathfrak{y} = \tfrac{1}{4}\sigma_1\rho_0^3. \tag{19.10}$$
Notice that $\chi$ is indeed of the second order, consistently with our
previous assumption. The radius of the disk of least confusion is
therefore only one-quarter of the radius of the image patch in $\mathscr{I}'$;
and the mutual distance between these disks is three-quarters of the
distance between I' and the marginal focus, i.e. the point in which
rays having $\rho = \rho_0$ intersect the axis. We also note in passing that
all rays touch the caustic surface
$$4\mathfrak{x}^3 = 27\sigma_1(\mathfrak{y}^2 + \mathfrak{z}^2). \tag{19.11}$$

Fig. 4.2


It will not come amiss at this stage to issue the general reminder 
that the validity of all results relating to displacements depends of 
course on the extent to which the geometrical-optical limit does in 
fact represent any ‘reasonable’ sort of approximation at all. In 
practice this means that reliable quantitative information can be 
expected only when the aberrations are in some sense large. One
need only reflect that in the extreme case of the absence of all
aberrations K should produce a point image I'. In actual fact, however,
it produces a small circular patch of light (the Airy disk)
surrounded by concentric rings. (It is assumed that h’ is not too 
large.) Similarly, in the presence of a very small amount of primary 
spherical aberration alone, (19.10) does not in fact give the best 
position of the receiving plane. 
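
Within the geometrical-optical limit, the position and size of the disk of least confusion quoted in (19.10) can be recovered by a direct search. The Python sketch below is illustrative only: it assumes $\sigma_1 > 0$, measures lengths in units of $d'$, samples the meridional displacement $\sigma_1\rho^3 - \chi\rho$ on a uniform grid in $\rho$, and locates the best plane by a ternary search between the paraxial and marginal foci. Function names and the choice of grid are hypothetical.

```python
def blur_radius(chi, sigma1, rho0, samples=2000):
    """Largest |sigma1*rho^3 - chi*rho| for 0 <= rho <= rho0 (grid estimate)."""
    worst = 0.0
    for i in range(samples + 1):
        rho = rho0 * i / samples
        worst = max(worst, abs(sigma1 * rho ** 3 - chi * rho))
    return worst


def best_focus(sigma1, rho0, iterations=60):
    """Ternary search for the shift chi which minimizes the blur radius."""
    lo, hi = 0.0, sigma1 * rho0 ** 2      # paraxial focus ... marginal focus
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if blur_radius(a, sigma1, rho0) < blur_radius(b, sigma1, rho0):
            hi = b
        else:
            lo = a
    chi = 0.5 * (lo + hi)
    return chi, blur_radius(chi, sigma1, rho0)


sigma1, rho0 = 1.0, 1.0                   # illustrative values only
chi_best, radius_best = best_focus(sigma1, rho0)
```

The search is legitimate because the blur radius, being the maximum of a family of functions each of which is the modulus of a linear function of $\chi$, is convex in $\chi$. With the values chosen the shift and the radius come out close to 3/4 and 1/4 respectively, in accordance with (19.10).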


(ii) Coma 
The partial displacement of (19.7) which varies as ph’ is 
9 =a(2+c0s26), 2 = asin 26, (19.12) 


Fig. 4.3 


where a = o, ph’. A zonal family of rays again intersects .%’ in a 
circle, its radius being a. This circle has its centre at the point 
(2a, 0), i.e. itis not concentric with J’. Moreover it is described twice 
as the radius vector in the plane of the exit pupil sweeps out its 
aperture once. The family of circles obtained by varying p touches 
two straight lines inclined at an angle yy, = 60° to each other. The 
image is therefore a flare-shaped patch of light bounded by these 
lines and by the arc of a circle of radius o, pgh’ (Fig. 4.3). This 
particular aberration is called (primary) coma, more exactly circu- 
lar, or linear, coma, to distinguish it from other types which occur 
amongst the aberrations of higher order. The presence of coma is 
particularly objectionable on account of the asymmetric appearance 
of the image, and in practice one will aim to remove it as far as 
possible. It cannot be improved by a different choice of receiving
plane. 
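
The geometry of the comatic circle is easily checked numerically. The following Python sketch is an illustration only and forms no part of the original argument; the function name `coma_point` and the sample value of `a` are assumptions made for the purpose. It samples (19.12) over a full revolution of the pupil azimuth, confirms that the points lie on a circle of radius a about (2a, 0), that θ and θ + π give the same point, and estimates the angle between the bounding tangents as twice the greatest polar angle attained.

```python
import math

def coma_point(a, theta):
    """Displacement (y, z) of equation (19.12) for pupil azimuth theta."""
    return a * (2.0 + math.cos(2.0 * theta)), a * math.sin(2.0 * theta)

a = 0.5  # illustrative value of the coma parameter a
samples = [coma_point(a, 2.0 * math.pi * i / 3600) for i in range(3600)]

# Every point should lie at distance a from the centre (2a, 0).
max_radius_error = max(abs(math.hypot(y - 2.0 * a, z) - a) for y, z in samples)

# theta and theta + pi give the same point, so the circle is described twice.
y0, z0 = coma_point(a, 0.3)
y1, z1 = coma_point(a, 0.3 + math.pi)
repeat_error = math.hypot(y0 - y1, z0 - z1)

# The tangents through the origin bound the polar angle of every point.
half_angle = max(abs(math.atan2(z, y)) for y, z in samples)
flare_angle_deg = math.degrees(2.0 * half_angle)

print(max_radius_error, repeat_error, flare_angle_deg)
```

Since a is proportional to the square of the zone radius, every zonal circle has the same ratio of centre distance to radius, which is why one pair of tangents serves the whole family.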

For any fixed value of $\rho$ tangential rays ($\theta = 0$ or $\pi$) and sagittal
rays ($\theta = \pm\tfrac12\pi$) meet in certain points along the $\hat{y}$-axis. Their
respective displacements relative to $I'$ are called tangential and
sagittal coma respectively, and they will be denoted by $\kappa_t$ and $\kappa_s$.
Here $\kappa_t = 3a$, $\kappa_s = a$, so that, upon indicating the order explicitly,

$$\kappa_{t3}/\kappa_{s3} = 3. \tag{19.13}$$


(iii) Curvature of field
The terms of (19.7) governed jointly by $\sigma_3$ and $\sigma_4$ are

$$\hat{y} = (3\sigma_3 + \sigma_4 - k)\rho h'^2\cos\theta, \qquad \hat{z} = (\sigma_3 + \sigma_4 - k)\rho h'^2\sin\theta, \tag{19.14}$$

this being the displacement in an out-of-focus plane $\mathscr{I}_k$ shifted
relative to $\mathscr{I}'$ by a second-order amount $\chi = kh'^2$, where $k$ is some
number. For general values of the constants involved in (19.14)
a zonal family of rays intersects $\mathscr{I}_k$ in an ellipse. This degenerates
into a straight line for each of the values

$$k_t = 3\sigma_3 + \sigma_4 \quad\text{and}\quad k_s = \sigma_3 + \sigma_4$$


of $k$. These lines are called tangential and sagittal focal lines respectively,
their mutual separation being $2\sigma_3 h'^2$. The ellipse becomes
a circle of radius $\sigma_3\rho h'^2$ in the plane defined by $k = 2\sigma_3 + \sigma_4$. The
focal lines are mutually at right angles: they are in fact nothing 
but the focal lines whose existence was already inferred in Section 
10. In the present context the earlier discussion amounts to taking 
not the conventional coordinate basis, but one whose $x$-axis and
$x'$-axis point along a principal ray, i.e. a ray through $O$ and $E'$.
(This definition is used throughout our work. Not infrequently 
principal rays are defined to be rays through O and the axial point 
of the stop; and one can run into apparent contradictions if one fails 
to bear in mind that in general these alternative definitions are 
inequivalent for all but paraxial rays.) 
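
The behaviour of the zonal ellipse as the receiving plane is shifted can be made concrete with a few lines of Python. This is a sketch only: the name `zonal_semi_axes` and the numerical values are invented for illustration, and the common factor of (19.14) is taken to be ρh'², as the order of the displacement requires. The semi-axis along the meridional direction vanishes in the plane k_t, that along the sagittal direction vanishes in the plane k_s, and the two are equal midway between.

```python
def zonal_semi_axes(sigma3, sigma4, k, rho, h):
    """Semi-axes (along y, along z) of the zonal ellipse of (19.14)."""
    scale = rho * h ** 2
    return (abs(3.0 * sigma3 + sigma4 - k) * scale,
            abs(sigma3 + sigma4 - k) * scale)

sigma3, sigma4 = 0.02, 0.05   # illustrative coefficients
rho, h = 1.0, 2.0             # illustrative zone radius and image height

k_t = 3.0 * sigma3 + sigma4   # plane of the tangential focal line
k_s = sigma3 + sigma4         # plane of the sagittal focal line
k_c = 2.0 * sigma3 + sigma4   # plane midway between them

axes_t = zonal_semi_axes(sigma3, sigma4, k_t, rho, h)
axes_s = zonal_semi_axes(sigma3, sigma4, k_s, rho, h)
axes_c = zonal_semi_axes(sigma3, sigma4, k_c, rho, h)
print(axes_t, axes_s, axes_c)
```

Note that the tangential focal line lies along the sagittal direction and conversely, in keeping with the remark that the two lines are mutually at right angles.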

The shape of the complete image, obtained by varying $\rho$ from $0$
to $\rho_p$, is elliptical, circular, or linear, as the case may be, and is
in every case disposed symmetrically about $I'_k$. When $\sigma_3 = 0$ a
plane object has a sharp image in a surface of revolution whose
polar curvature is

$$c = 2\sigma_4. \tag{19.15}$$
Accordingly the aberration governed by $\sigma_4$ is known as (primary)
curvature of field, more specifically as Petzval curvature. Further,
according to (19.14) all meridional rays pass through one point of
the surface of revolution whose polar curvature is

$$c_t = 2(3\sigma_3 + \sigma_4); \tag{19.16}$$

whilst all sagittal rays pass through one point of a surface of revolution
whose polar curvature is

$$c_s = 2(\sigma_3 + \sigma_4); \tag{19.17}$$

so that, incidentally,

$$c_t - c = 3(c_s - c). \tag{19.18}$$

The three ancillary surfaces just defined of course touch $\mathscr{I}'$ at
$O'_0$. The first of these is often referred to as the Petzval surface;
whilst the other two are the tangential and sagittal image surfaces
respectively. The existence of these as distinct surfaces is guaranteed
if only the coefficient $\sigma_3$ does not vanish, and the defect of the
imagery governed by $\sigma_3$ is therefore known as (primary) astigmatic
curvature of field, or simply astigmatism.


(iv) Distortion 
The last of the partial displacements in (19.7) is simply

$$\hat{y} = \sigma_5 h'^3, \qquad \hat{z} = 0. \tag{19.19}$$

$K$ therefore produces a sharp image in $\mathscr{I}'$, but it is distorted unless
$\sigma_5 = 0$. Accordingly the defect of the image governed by $\sigma_5$ is
known as (primary) distortion. Correctly to the present order, a
segment of the straight line $y = \text{const.} = k/m$, say, in $\mathscr{I}$ is transformed
into an arc of the parabola

$$Y' = k[1 + \sigma_5(k^2 + Z'^2)] \tag{19.20}$$

in $\mathscr{I}'$. Cursory examination of the image of a square grid will
reveal the reason for speaking of pin-cushion distortion and barrel
distortion when $\sigma_5 > 0$ and $\sigma_5 < 0$ respectively. Note that this type
of aberration is asymmetric, for the image lies to one side of $I'$.
When all four kinds of displacements are present together the 
curve generated by the points of intersection of a zonal family of 
rays with $\mathscr{I}'$ will be quite complicated. Nevertheless, certain features
of the image can be isolated without much difficulty. For
instance, the asymmetry of the image relative to $I'$ is still governed
by $\sigma_2$ and $\sigma_5$. However, the distortion is really irrelevant to the
extent that it does not contribute to the asymmetry of the illuminated
patch as a whole. One may therefore define the tangential asymmetry

$$K_t = \tfrac12[\hat{y}(\rho, 0) + \hat{y}(\rho, \pi)] - \hat{y}(0, \theta) \tag{19.21}$$

and the sagittal asymmetry

$$K_s = \hat{y}(\rho, \pi/2) - \hat{y}(0, \theta), \tag{19.22}$$

as appropriate partial measures of the asymmetry of the image patch.
In the present circumstances one finds of course that

$$K_t = \kappa_{t3}, \qquad K_s = \kappa_{s3}.$$

More usefully one might define a mean asymmetry

$$K_m = \tfrac12(K_t + K_s). \tag{19.23}$$

Then the vanishing of $K_m$ is a necessary, though not a sufficient,
condition for the symmetry of the image patch.
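
That these measures isolate the comatic part of the displacement may be seen from a short numerical experiment. The Python sketch below is illustrative only. It assumes that the meridional component of the total primary displacement (19.7) is the sum of the four partial displacements discussed above, with the spherical term taken as σ₁ρ³cos θ; the function names and coefficient values are hypothetical.

```python
import math

def y_hat(sigma, rho, theta, h):
    """Meridional component of the primary displacement (assumed form of (19.7))."""
    s1, s2, s3, s4, s5 = sigma
    return (s1 * rho ** 3 * math.cos(theta)
            + s2 * rho ** 2 * h * (2.0 + math.cos(2.0 * theta))
            + (3.0 * s3 + s4) * rho * h ** 2 * math.cos(theta)
            + s5 * h ** 3)

def asymmetries(sigma, rho, h):
    """Tangential, sagittal and mean asymmetry, after (19.21) to (19.23)."""
    centre = y_hat(sigma, 0.0, 0.0, h)          # principal ray; carries the distortion
    k_t = 0.5 * (y_hat(sigma, rho, 0.0, h) + y_hat(sigma, rho, math.pi, h)) - centre
    k_s = y_hat(sigma, rho, 0.5 * math.pi, h) - centre
    return k_t, k_s, 0.5 * (k_t + k_s)

sigma = (0.3, 0.1, 0.2, 0.4, 0.7)   # illustrative sigma_1 ... sigma_5
rho, h = 1.5, 2.0
a = sigma[1] * rho ** 2 * h         # the coma parameter of (19.12)
K_t, K_s, K_m = asymmetries(sigma, rho, h)
print(K_t, K_s, K_m, a)
```

The spherical and astigmatic terms are odd under θ → θ + π and so cancel in the mean taken in (19.21), whilst the distortion is removed by subtracting the displacement of the principal ray.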

We conclude this section by showing briefly how the same results
are arrived at when the work is based on the use of $T$ rather than of
$V$. We write generically

$$t^{(2)} = (s-m)^{-1}(p_1\xi^2 + p_2\xi\eta + p_3\xi\zeta + p_4\eta^2 + p_5\eta\zeta), \tag{19.24}$$

where the factor $(s-m)^{-1}$ has been inserted for convenience. The
rotational invariants are those given by (15.10). (17.10) gives
straight away

$$\boldsymbol{\epsilon}'_3 = (4p_1\xi + 2p_2\eta + 2p_3\zeta)(\boldsymbol{\beta} - s\boldsymbol{\beta}') + (p_2\xi + 2p_4\eta + p_5\zeta)(\boldsymbol{\beta} - m\boldsymbol{\beta}'). \tag{19.25}$$


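Now in view of (14.35) we have here

$$T = a_0 + (f/2m)\zeta + O(4), \tag{19.26}$$

whence, by differentiation,

$$\mathbf{y}_1 = f(\boldsymbol{\beta} - m\boldsymbol{\beta}') + O(3). \tag{19.27}$$

(14.35) can be thought of as relating to the pupil planes, provided
$m$ be replaced by $s$, so that

$$\mathbf{y}_e = f(\boldsymbol{\beta} - s\boldsymbol{\beta}') + O(3). \tag{19.28}$$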
Just as it was previously convenient to prevent the continued 
appearance of the constant d’ by choosing a suitable unit of length, 
so we can now choose a unit of length such that $f = 1$, recalling that
$f$ cannot be zero here. Correctly to the present order we therefore
have to make the substitutions

$$\beta - m\beta' = h', \quad \gamma - m\gamma' = 0, \quad \beta - s\beta' = \rho\cos\theta, \quad \gamma - s\gamma' = \rho\sin\theta \tag{19.29}$$

in (19.25). Since (19.4) formally also applies here now, (19.25) will
at once be seen to yield (19.5) exactly; and thereafter everything
follows as before. In practice, however, one must be careful to
bear in mind the conventions regarding the different choices of
units of length which have hitherto been made.


20. The fifth-order displacement in the absence of 
third-order aberrations 


The considerations of the preceding section will now be extended to 
aberrations of order five, supposing those of the third order to be 
absent. Equivalently, we wish to study that part of the fifth-order 
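displacement which is generated by the fifth-order aberration
function $v^{(3)}$. In analogy with (19.1) we write

$$v^{(3)} = s_1\xi^3 + s_2\xi^2\eta + s_3\xi^2\zeta + s_4\xi\eta^2 + s_5\xi\eta\zeta + s_6\xi\zeta^2 + s_7\eta^3 + s_8\eta^2\zeta + s_9\eta\zeta^2. \tag{20.1}$$

The quantity of interest is the displacement $\boldsymbol{\epsilon}'_5(v^{(3)})$, the notation
corresponding to that introduced shortly after equation (11.6).
However, we naturally write simply $\boldsymbol{\epsilon}'_5$ for it, since we are, after all,
not considering the total displacement. Thus, again from (17.7),

$$\boldsymbol{\epsilon}'_5 = (6s_1\xi^2 + 4s_2\xi\eta + 4s_3\xi\zeta + 2s_4\eta^2 + 2s_5\eta\zeta + 2s_6\zeta^2)\mathbf{y}' + (s_2\xi^2 + 2s_4\xi\eta + s_5\xi\zeta + 3s_7\eta^2 + 2s_8\eta\zeta + s_9\zeta^2)\mathbf{y}_1, \tag{20.2}$$

from which one gets the fifth-order equations corresponding to
(19.5):

$$\hat{y} = 6s_1\rho^5\cos\theta + 2s_2\rho^4h'(\tfrac32 + \cos 2\theta) + 2\rho^3h'^2[(2s_3 + s_4) + s_4\cos^2\theta]\cos\theta + \rho^2h'^3[(2s_5 + \tfrac32 s_7) + (s_5 + \tfrac32 s_7)\cos 2\theta] + 2(s_6 + s_8)\rho h'^4\cos\theta + s_9h'^5,$$

$$\hat{z} = 6s_1\rho^5\sin\theta + 2s_2\rho^4h'\sin 2\theta + 2\rho^3h'^2(2s_3 + s_4\cos^2\theta)\sin\theta + s_5\rho^2h'^3\sin 2\theta + 2s_6\rho h'^4\sin\theta. \tag{20.3}$$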
Of the six types of partial displacement, three are now each governed
jointly by two distinct aberration coefficients. We proceed to 
investigate the various types in turn. For this purpose it is appro- 
priate to pretend that the primary aberrations are in fact absent. 


(i) Spherical aberration 
The first partial displacement

$$\hat{y} = 6s_1\rho^5\cos\theta, \qquad \hat{z} = 6s_1\rho^5\sin\theta \tag{20.4}$$

is an obvious fifth-order counterpart to (19.3), and it is accordingly
called secondary spherical aberration. The image patch is circular,
and centred on $I'$. There is again a disk of least confusion, located in
a certain plane between $\mathscr{I}'$ and the plane containing the marginal
focus. See also Section 21.


(ii) Circular coma 
The terms of (20.3) governed by $s_2$ are

$$\hat{y} = a(\tfrac32 + \cos 2\theta), \qquad \hat{z} = a\sin 2\theta, \tag{20.5}$$

where now $a = 2s_2\rho^4h'$. The resemblance to (19.12) is obvious,
and this aberration is accordingly called secondary circular coma.
The circle of intersection points of a zonal family of rays has radius
$a$, whilst its centre is at $(\tfrac32 a, 0)$. The whole family of such circles
touches a pair of straight lines, their mutual inclination being
$\psi_5 = 2\arcsin\tfrac23 \approx 84°$. The general appearance of the image thus
closely resembles that of Fig. 4.3, if due allowance be made for the
increased angle between the asymptotes. Also, one now has

$$\kappa_{t5} = \tfrac52 a, \qquad \kappa_{s5} = \tfrac12 a,$$

so that

$$\kappa_{t5}/\kappa_{s5} = 5. \tag{20.6}$$


(iii) Oblique spherical aberration 


The partial displacement varying as $\rho^3h'^2$ is conveniently written

$$\hat{y} = a(k + 1 + \cos^2\theta)\cos\theta, \qquad \hat{z} = a(k + \cos^2\theta)\sin\theta, \tag{20.7}$$

with $a = 2s_4\rho^3h'^2$ and $k = 2s_3/s_4$. It has no third-order counterpart,
and it represents (secondary) oblique spherical aberration. This 
terminology presumably arose from the fact that this aberration 
varies with p in the same way as does primary spherical aberration, 
but, unlike the latter, vanishes on the axis. Also, when $s_4$ happens to be zero,
one does have an effect resembling primary spherical aberration for 
any given value of h’. 

The usual curves are drawn in Fig. 4.4 for a few typical values of
$k$, $a$ having been given the nominal value unity. The scale used is
indicated in each case along the axes. In general terms, one has an
oval for sufficiently large values of $k$. As $k$ decreases this oval becomes
more and more elongated along the $\hat{y}$-axis, until (for
$0 < k < 2$) it presents a pinched-in appearance. When $-1 < k < 0$
the curve intersects itself at points of the $\hat{y}$-axis. The only curve
which is symmetric about both axes arises when $k = -1$, and then

$$\hat{y}^{2/3} + \hat{z}^{2/3} = 1. \tag{20.8}$$

Finally, for values of $k$ less than $-1$ one essentially goes through the
same sequence of curves again, in the sense that the curve belonging
to the value $-k$ is the same as the curve belonging to the value
$k - 2$, but turned through 90° about the origin.

The full image patch results from letting $\rho$ vary through its full
range. In all cases it is symmetric with respect to $I'$, its general
appearance being that of the generating curves, except when
$-1 < k < 0$. In the exceptional case $s_4 = 0$ the image is exactly
circular.
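
Two of the statements made about the zonal curves of (20.7) lend themselves to a direct numerical check. The Python sketch below is offered as an illustration only; the helper names are invented here and a is given the nominal value unity. It verifies that for k = −1 the sampled points satisfy (20.8), and that the curve belonging to −(k + 2) coincides with the curve belonging to k after a quarter turn about the origin, which is the stated correspondence between the values −k and k − 2 written in another way.

```python
import math

def zonal_point(k, theta, a=1.0):
    """Point (y, z) of the zonal curve (20.7)."""
    c, s = math.cos(theta), math.sin(theta)
    return a * (k + 1.0 + c * c) * c, a * (k + c * c) * s

def rotate_quarter_turn(point):
    """Rotate a point through 90 degrees about the origin."""
    y, z = point
    return -z, y

thetas = [2.0 * math.pi * i / 97 for i in range(97)]

# k = -1: the curve should satisfy |y|^(2/3) + |z|^(2/3) = 1 when a = 1.
astroid_error = max(
    abs(abs(y) ** (2.0 / 3.0) + abs(z) ** (2.0 / 3.0) - 1.0)
    for y, z in (zonal_point(-1.0, t) for t in thetas)
)

# The point at theta on the curve for -(k + 2) should coincide with the
# rotated point at theta + pi/2 on the curve for k.
k = 0.7
rotation_error = max(
    math.hypot(p[0] - q[0], p[1] - q[1])
    for p, q in (
        (zonal_point(-(k + 2.0), t),
         rotate_quarter_turn(zonal_point(k, t + math.pi / 2.0)))
        for t in thetas
    )
)
print(astroid_error, rotation_error)
```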


(iv) Elliptical coma


We come now to another new type of aberration, the partial displacement
varying as $\rho^2h'^3$. Write

$$\hat{y} = a(k + 1 + k\cos 2\theta), \qquad \hat{z} = a\sin 2\theta, \tag{20.9}$$

where $a = s_5\rho^2h'^3$ and $k = 1 + 3s_7/(2s_5)$. (20.9) is reminiscent of
(20.5), though here we have to deal with an additional parameter $k$.
The usual curve generated by a set of zonal rays is an ellipse:

$$[\hat{y} - (k + 1)a]^2 + k^2\hat{z}^2 = k^2a^2, \tag{20.10}$$

and this is described twice as $\theta$ goes from $0$ to $2\pi$. Upon varying $\rho$,
all these ellipses will be found to touch a pair of straight lines through
$I'$, provided the condition $k > -\tfrac12$ is satisfied. The angle between
the tangents is then $2\,\mathrm{arccot}\,(2k + 1)^{1/2}$. Under these circumstances
the image patch will be bounded by these tangents and the arc of
an ellipse, and will therefore have an appearance rather similar to
that which results from the presence of circular coma. For this
reason the aberration under discussion is known as (secondary)
elliptical coma. When $k > -\tfrac12$ the image lies wholly to one side of $I'$.
In particular, when $k = 0$, $\hat{y}$ no longer depends upon $\theta$, so that the
image patch will then be triangular in shape, with a right angle at
$I'$, whilst when $s_5$ vanishes it reduces to a part of the $\hat{y}$-axis. The
image is always situated asymmetrically with respect to $I'$, save in
the exceptional case $k = -1$; for then it is exactly a circle, centred
on $I'$. Consistently with this result, (19.23) just gives $K_m = 0$
when $k = -1$.
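
The angle between the bounding tangents can be confirmed by brute force. In the Python sketch below, which is an illustration and not the author's procedure, the closed form 2 arccot (2k + 1)^{1/2} is compared with twice the greatest polar angle found by sampling the ellipse (20.9); the function names and the sampling density are arbitrary choices.

```python
import math

def elliptical_coma_point(a, k, theta):
    """Point (y, z) of the zonal ellipse (20.9)."""
    return a * (k + 1.0 + k * math.cos(2.0 * theta)), a * math.sin(2.0 * theta)

def tangent_angle_deg(k):
    """Angle 2 arccot sqrt(2k + 1) between the bounding tangents (k > -1/2)."""
    if k <= -0.5:
        raise ValueError("tangents through I' exist only for k > -1/2")
    return math.degrees(2.0 * math.atan2(1.0, math.sqrt(2.0 * k + 1.0)))

def sampled_angle_deg(k, steps=20000):
    """Twice the largest polar angle reached on the ellipse, by sampling."""
    half = max(
        abs(math.atan2(z, y))
        for y, z in (elliptical_coma_point(1.0, k, math.pi * i / steps)
                     for i in range(steps))
    )
    return math.degrees(2.0 * half)

for k in (1.0, 0.5, 0.0, -0.25):
    print(k, tangent_angle_deg(k), sampled_angle_deg(k))
```

The value k = 1 corresponds to the vanishing of s₇ and reproduces the 60° of circular coma, whilst k = 0 gives the right angle mentioned above.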


(v) Curvature of field


The aberration governed jointly by $s_6$ and $s_8$ is best considered in
an out-of-focus plane $\mathscr{I}_k$. We now take $\chi = kh'^4$, where $k$ is some
constant, and then

$$\hat{y} = [2(s_6 + s_8) - k]\rho h'^4\cos\theta, \qquad \hat{z} = (2s_6 - k)\rho h'^4\sin\theta. \tag{20.11}$$

This is entirely analogous to (19.14), for which reason this defect
of the imagery is called (secondary) curvature of field. The tangential
and sagittal focal lines will be found in the planes defined by

$$k_t = 2(s_6 + s_8), \qquad k_s = 2s_6, \tag{20.12}$$

respectively. When $s_8 = 0$ these lines coalesce into a single point, so
that an object in $\mathscr{I}$ then has a sharp image lying in a surface of revolution
which, to the present order, has the equation

$$x' = 1 + 2s_6(y'^2 + z'^2)^2. \tag{20.13}$$

(Recall that the origin is at $E'$.) Evidently $s_8$ is properly called the
coefficient of (secondary) astigmatism; and provided $s_8 = 0$, $s_6$
is the secondary analogue of the coefficient of Petzval curvature.
(See also the end of Section 51.)


(vi) Distortion 


The sole remaining terms of (20.3) are represented by

$$\hat{y} = s_9h'^5, \qquad \hat{z} = 0. \tag{20.14}$$


This is entirely analogous to (19.19), and this defect is called 
(secondary) distortion. It is not necessary to go into detail, as any 
discussion would be little more than a repetition of what was 
already said in the context of primary distortion. 

We could now go on to discuss the tertiary displacement induced 
by the tertiary aberration function along exactly similar lines. How- 
ever, little is to be gained by restricting ourselves to this particular 
order, and we proceed directly to the aberrations of order $2n-1$
(any $n$).




21. The displacement of any order in the absence of lower 
orders 


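Assume that $v^{(s)} = 0$ for $s = 1, 2, \ldots, n-1$, so that we can examine
the displacement $\boldsymbol{\epsilon}'_{2n-1}$ generated by $v^{(n)}$. Recall from Section 16
that

$$v^{(n)} = \sum_{\mu=0}^{n}\sum_{\nu=0}^{n-\mu} v^{(n)}_{\mu\nu}\,\xi^{n-\mu-\nu}\eta^{\mu}\zeta^{\nu} \qquad (\nu \neq n), \tag{21.1}$$

the term $\nu = n$ having been omitted since, as we know, it does not
contribute anything to $\boldsymbol{\epsilon}'$, as far as the present conjugate planes are
concerned. It is very convenient to make the substitutions (19.3) and
(19.4) already at this stage. Since we need not differentiate with
respect to $\mathbf{y}_1$, $h'$ is just a fixed constant, whilst $\rho$ and $\theta$ take the place
of $\mathbf{y}'$. Then (21.1) becomes at once

$$v^{(n)} = \sum_{\lambda=0}^{2n-1}\rho^{2n-\lambda}h'^{\lambda}\Theta^{(n)}_\lambda = \sum_{\lambda=0}^{2n-1} v^{(n)}_\lambda, \tag{21.2}$$

say, where

$$\Theta^{(n)}_\lambda = \sum_\nu v^{(n)}_{\lambda-2\nu,\nu}\cos^{\lambda-2\nu}\theta. \tag{21.3}$$

The sum over $\nu$ goes from $0$ or $\lambda-n$, according as $\lambda-n < 0$ or
$\lambda-n \geq 0$, to $\tfrac12\lambda$ if $\lambda$ is even or $\tfrac12(\lambda-1)$ if it is odd. (21.2) exhibits
$v^{(n)}$ as the sum of $2n$ terms, each of which just corresponds to one of
the $2n$ types of aberrations of order $2n-1$. The partial displacement
(of order $2n-1$) generated by $v^{(n)}_\lambda$ is given by

$$\hat{y} = \left(\cos\theta\,\frac{\partial}{\partial\rho} - \frac{\sin\theta}{\rho}\frac{\partial}{\partial\theta}\right)v^{(n)}_\lambda, \qquad \hat{z} = \left(\sin\theta\,\frac{\partial}{\partial\rho} + \frac{\cos\theta}{\rho}\frac{\partial}{\partial\theta}\right)v^{(n)}_\lambda; \tag{21.4}$$

and we proceed to examine in this section various cases, corresponding
to selected values of $\lambda$, in turn.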
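
The passage from a single term of the aberration function to its partial displacement is purely mechanical, and may be sketched in Python as follows. The sketch rests on an assumption, namely that (21.4) is the gradient with respect to the pupil coordinates expressed in the polar variables ρ and θ; the tuple layout `(c, p, q, m)` for a term c ρ^p h'^q cos^m θ and all function names are inventions of this example. As a check it reproduces the circular coma (21.9) from the single term (21.8).

```python
import math

def displacement(term, rho, theta, h):
    """Polar-coordinate gradient of v = c * rho**p * h**q * cos(theta)**m."""
    c, p, q, m = term
    ct, st = math.cos(theta), math.sin(theta)
    dv_drho = c * p * rho ** (p - 1) * h ** q * ct ** m
    dv_dtheta = -c * m * rho ** p * h ** q * ct ** (m - 1) * st if m > 0 else 0.0
    y = ct * dv_drho - st * dv_dtheta / rho
    z = st * dv_drho + ct * dv_dtheta / rho
    return y, z

def circular_coma(n, v10, rho, theta, h):
    """Closed form (21.9) for coma of order 2n - 1."""
    a = (n - 1) * v10 * rho ** (2 * n - 2) * h
    return a * (n / (n - 1) + math.cos(2.0 * theta)), a * math.sin(2.0 * theta)

v10, rho, theta, h = 0.2, 0.8, 0.6, 1.3   # illustrative values
errors = []
for n in (2, 3, 4):
    y, z = displacement((v10, 2 * n - 1, 1, 1), rho, theta, h)
    yc, zc = circular_coma(n, v10, rho, theta, h)
    errors.append(math.hypot(y - yc, z - zc))
print(errors)
```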


(i) $\lambda = 0$
We have

$$v^{(n)}_0 = v^{(n)}_{00}\rho^{2n}, \tag{21.5}$$

so that, since this aberration is independent of $h'$, it suffices to
restrict oneself to meridional rays. Then

$$\hat{y} = 2nv^{(n)}_{00}\rho^{2n-1} - \chi\rho \tag{21.6}$$

in an out-of-focus plane $\mathscr{I}_k$. The displacement (21.6) is clearly the
generalization to all orders of that given by (19.8) and (20.4), and
so we have here spherical aberration of order $2n-1$. (21.6) may be
used to obtain the position and radius of the disk of least confusion.
The work involved is elementary, and since the problem is rather
academic we merely quote the result. The radius of the disk of
least confusion is smaller than that of the image in $\mathscr{I}'$ by a fraction
$2(n-1)c^{2n-1}$, where $c$ is the real root of the equation

$$\sum_{s=0}^{2n-3}(-1)^s(s+1)c^s = 0. \tag{21.7}$$

Note that the image is of course disposed symmetrically about $I'$.
It should perhaps be remarked that even when image patches corresponding
to different values of $n$ happen to have the same radii,
the actual distribution of light within them will, of course, depend
upon $n$.
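
The elementary work alluded to may be sketched numerically. In the Python fragment below, which is an illustration under stated assumptions and not the author's derivation, units are chosen so that the coefficient of ρ^{2n−1} in (21.6) and the marginal zone radius are both unity. The disk of least confusion is then located by equating the interior extreme of |ŷ|, attained at the zone ρ = c, with its value for the marginal ray; this gives the condition 2(n−1)c^{2n−1} + (2n−1)c^{2n−2} − 1 = 0, which we take to be equivalent to (21.7). A direct sampling of the patch radius serves as a cross-check.

```python
def least_confusion(n, iterations=200):
    """Return (c, fraction, chi) for spherical aberration of order 2n - 1.

    With y(rho) = rho**(2n - 1) - chi * rho and marginal zone rho = 1, the
    marginal focus lies at chi = 1 and the patch radius in chi = 0 is 1.
    """
    def balance(c):
        # interior extreme 2(n-1)c^(2n-1) minus marginal value 1 - (2n-1)c^(2n-2)
        return 2 * (n - 1) * c ** (2 * n - 1) + (2 * n - 1) * c ** (2 * n - 2) - 1.0

    lo, hi = 0.0, 1.0                 # balance(0) < 0 < balance(1)
    for _ in range(iterations):       # plain bisection
        mid = 0.5 * (lo + hi)
        if balance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    return c, 2 * (n - 1) * c ** (2 * n - 1), (2 * n - 1) * c ** (2 * n - 2)

def patch_radius(n, chi, steps=20000):
    """Radius of the image patch in the plane chi, by sampling rho in [0, 1]."""
    return max(abs((i / steps) ** (2 * n - 1) - chi * i / steps)
               for i in range(steps + 1))

for n in (2, 3, 4):
    c, fraction, chi = least_confusion(n)
    print(2 * n - 1, c, fraction, chi, patch_radius(n, chi))
```

For n = 2 the sketch returns the one-quarter radius and three-quarters shift quoted for primary spherical aberration in Section 19.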


(ii) $\lambda = 1$
Again we have only a single term, i.e.

$$v^{(n)}_1 = v^{(n)}_{10}\rho^{2n-1}h'\cos\theta. \tag{21.8}$$

The partial displacement obtained from this by means of (21.4) is

$$\hat{y} = a[n/(n-1) + \cos 2\theta], \qquad \hat{z} = a\sin 2\theta, \tag{21.9}$$

with $a = (n-1)v^{(n)}_{10}\rho^{2n-2}h'$. This is clearly a generalization of (19.12)
and (20.5), so that (21.9) represents circular coma of order $2n-1$.
One has the familiar family of circles, touched by two straight lines
through $I'$, their mutual inclination being $\psi_{2n-1} = 2\arcsin(1 - 1/n)$.
Also

$$\kappa_{t,2n-1}/\kappa_{s,2n-1} = 2n-1, \tag{21.10}$$


so that the value of this ratio is just the order of the aberration 
being considered. The image lies entirely to one side of I’, as before. 
Note that when spherical aberration has been removed, circular 
coma is the dominant aberration for systems of small field angle, such 
as microscope objectives or astronomical telescopes. 
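
The dependence of the comatic flare on the order is conveniently tabulated by machine. The Python lines below are a sketch with invented function names; they evaluate the angle 2 arcsin (1 − 1/n) between the bounding tangents and the ratio of tangential to sagittal coma, the latter obtained from the extreme and central ordinates of the zonal circle of (21.9).

```python
import math

def coma_flare_angle_deg(n):
    """Angle between the tangents bounding circular coma of order 2n - 1."""
    return math.degrees(2.0 * math.asin(1.0 - 1.0 / n))

def coma_ratio(n):
    """Ratio of tangential to sagittal coma from the circle of (21.9)."""
    centre = n / (n - 1.0)            # centre of the zonal circle in units of a
    return (centre + 1.0) / (centre - 1.0)

for n in (2, 3, 4, 5):
    print(2 * n - 1, round(coma_flare_angle_deg(n), 2), round(coma_ratio(n), 6))
```

As n increases the angle tends to 180°, so that the flare of high-order coma opens out more and more.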


(iii) $\lambda = 2$
We now have to deal with two terms, i.e.

$$v^{(n)}_2 = \rho^{2n-2}h'^2(v^{(n)}_{20}\cos^2\theta + v^{(n)}_{01}). \tag{21.11}$$

The corresponding displacement is

$$\hat{y} = a[k + 1/(n-2) + \cos^2\theta]\cos\theta, \qquad \hat{z} = a(k + \cos^2\theta)\sin\theta, \tag{21.12}$$

where

$$a = 2(n-2)v^{(n)}_{20}\rho^{2n-3}h'^2, \qquad k = (n-1)v^{(n)}_{01}/[(n-2)v^{(n)}_{20}].$$

This evidently generalizes (20.7). That it also generalizes (19.14)
is fortuitous, in as far as it hinges on the vanishing of $a$ when $n$
happens to have the value 2. (21.12) thus represents oblique spherical
aberration of order $2n-1$. (Note that this terminology now appears
even less well motivated than before, as the displacement does not
even vary as the cube of $\rho$ when $n > 3$.) The general trend of the
usual curves generated by a zonal family of rays is not essentially
dissimilar from that already encountered when $n = 3$. Here, whatever
the curve for a certain value of $k$ may be, the same curve,
but turned through 90°, obtains for $-[k + (n-1)/(n-2)]$. The
previous critical value $k = -1$ therefore now becomes

$$-\tfrac12(n-1)/(n-2);$$

but the critical curve itself is somewhat more complicated than
before.


(iv) $\lambda = 2n-3$
Again we have two terms, namely

$$v^{(n)}_{2n-3} = \rho^3h'^{2n-3}(v^{(n)}_{3,n-3}\cos^3\theta + v^{(n)}_{1,n-2}\cos\theta). \tag{21.13}$$

The displacement is given exactly by (20.9), provided one now takes
$a$ and $k$ to be given by

$$a = v^{(n)}_{1,n-2}\rho^2h'^{2n-3}, \qquad k = 1 + 3v^{(n)}_{3,n-3}/(2v^{(n)}_{1,n-2}).$$

This aberration is therefore elliptical coma of order $2n-1$; and its
description is fully covered by that given previously in the special
case $n = 3$.

(v) $\lambda = 2n-2$
Now

$$v^{(n)}_{2n-2} = \rho^2h'^{2n-2}(v^{(n)}_{2,n-2}\cos^2\theta + v^{(n)}_{0,n-1}), \tag{21.14}$$

whence, in an out-of-focus plane given by $\chi = kh'^{2n-2}$,

$$\hat{y} = \rho h'^{2n-2}[2(v^{(n)}_{2,n-2} + v^{(n)}_{0,n-1}) - k]\cos\theta, \qquad \hat{z} = \rho h'^{2n-2}(2v^{(n)}_{0,n-1} - k)\sin\theta. \tag{21.15}$$

Here we are evidently dealing with curvature of field of order
$2n-1$. The condition for the formation of a sharp image lying in a
certain surface of revolution is that $v^{(n)}_{2,n-2}$ should vanish. When it
does not, one again has two focal lines, so that this coefficient governs
astigmatism of order $2n-1$.


(vi) $\lambda = 2n-1$
Here we have simply

$$v^{(n)}_{2n-1} = \rho h'^{2n-1}v^{(n)}_{1,n-1}\cos\theta, \tag{21.16}$$

which implies that

$$\hat{y} = v^{(n)}_{1,n-1}h'^{2n-1}, \qquad \hat{z} = 0. \tag{21.17}$$

We are evidently dealing with distortion of order $2n-1$, any discussion
of which is surely superfluous.

All third- and fifth-order aberrations previously encountered
have now been generalized to order $2n-1$, and we could go on to
consider in detail the new types which arise as we go on to the
seventh and higher orders. Such a specific discussion, however,
becomes increasingly academic, and we shall therefore deal explicitly
with only one further aberration. As we go from fifth to
seventh order two new types appear, of which one is governed
jointly by two coefficients, the other by three. We therefore contemplate
the former, which, upon generalizing to any order, is

$$v^{(n)}_3 = \rho^{2n-3}h'^3(v^{(n)}_{30}\cos^3\theta + v^{(n)}_{11}\cos\theta). \tag{21.18}$$

Then

$$\hat{y} = a\left[(n-1)\left(\frac{3}{n-3} + \frac{k}{n-2}\right) + \left(\frac{2(2n-3)}{n-3} + k\right)\cos 2\theta + \cos 4\theta\right],$$

$$\hat{z} = a[(2 + k)\sin 2\theta + \sin 4\theta], \tag{21.19}$$

where $a = \tfrac14(n-3)v^{(n)}_{30}\rho^{2n-4}h'^3$
and $k = 4(n-2)v^{(n)}_{11}/[(n-3)v^{(n)}_{30}]$.


The usual zonal curves may have a great variety of shapes, and it is
hardly worth while enumerating them all. In certain cases one again
gets a comatic flare closely resembling that familiar from elliptical
coma. It is also possible to get a set of figures-of-eight, such that one
set of loops is touched by a pair of straight lines through $I'$. The
image is, however, evidently not wholly contained within these
tangents; nor does it lie entirely to one side of $I'$.

With this example we conclude our enumeration of particular 
aberrations, and go on to a general classification and terminology. 
22. General classification of aberrations


For the purpose of discussing the various aberrations which contribute
to the displacement of order $2n-1$, we continue to suppose
that the aberrations of order less than $2n-1$ are absent. From
(21.2) we know that

$$v^{(n)}_\lambda = \rho^{2n-\lambda}h'^{\lambda}\Theta_\lambda, \tag{22.1}$$

where $\Theta_\lambda$ is a polynomial in $\cos\theta$, containing only even or odd
powers of this, according as $\lambda$ is even or odd. We note in passing
that in proceeding from order 2n—1 to order 2n+ 1 two new types 
of aberrations appear, and in view of (16.4) these are governed by 
n+ 2 coefficients altogether. This result alone shows that the general 
zonal curves corresponding to the various types become increasingly 
complicated, and for sufficiently large n it is neither feasible nor 
useful to enumerate them in detail; so that a characterization of a 
general kind must suffice (see also the end of Section 40). 
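
The count of coefficients belonging to each type is readily mechanized. The Python sketch below is illustrative only, with invented function names; it uses the limits of the sum over ν stated in Section 21 to list, for each λ, the number of coefficients which jointly govern that type, and confirms that the step from order 2n−1 to order 2n+1 adds n + 2 coefficients.

```python
def nu_range(n, lam):
    """Values of nu contributing to the type lam in order 2n - 1."""
    return range(max(0, lam - n), lam // 2 + 1)

def coefficients_per_type(n):
    """Number of coefficients governing each of the 2n types, lam = 0 ... 2n - 1."""
    return [len(nu_range(n, lam)) for lam in range(2 * n)]

def total_coefficients(n):
    return sum(coefficients_per_type(n))

for n in (2, 3, 4):
    print(2 * n - 1, coefficients_per_type(n), total_coefficients(n))
```

For the fifth order the list shows three types each governed by two coefficients, as found in Section 20; for the seventh order the two new types are those with λ = 3 and λ = 4.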

Consider therefore the displacement generated by (22.1). It
has the generic form

$$\hat{y} = q\cos\theta\,\Theta_1, \qquad \hat{z} = q\sin\theta\,\Theta_2 \qquad (q = \rho^{2n-\lambda-1}h'^{\lambda}), \tag{22.2}$$

where $\Theta_1$ and $\Theta_2$ are polynomials in $\cos\theta$, again containing only even
or odd powers of this according as $\lambda$ is even or odd. When, and only
when, $\lambda$ is odd (22.2) can be written in the form

$$\hat{y} = q\sum_{s=0} a_s\cos 2s\theta, \qquad \hat{z} = q\sum_{s=1} b_s\sin 2s\theta, \tag{22.3}$$

where the $a_s$ and $b_s$ are certain constants. In this case the usual
zonal curves are described twice every time the radius vector in the
exit pupil sweeps the latter out once. As $\rho$ increases the various
curves remain geometrically similar to one another; but, ignoring
the over-all change of dimension, the curves suffer an increasing
bodily shift, on account of the presence of the term $qa_0$ in $\hat{y}$. If the
various curves do not contain $I'$ there will be a pair of fixed tangents
touching all of them, and such tangents may also exist when the
curves have loops which do not contain $I'$. The image is in general
asymmetric, i.e. it has no line of symmetry parallel to the $\hat{z}$-axis.
It is clear that circular coma is essentially characteristic of the
aberrations which have just been described, and for this reason they
are called comatic. Every comatic displacement varies as an odd
power of $h'$ and an even power of $\rho$.

When $\lambda$ is even, $\hat{y}$ and $\hat{z}$ are also periodic in $\theta$, but the period is
now $2\pi$. Moreover, every zonal curve is now symmetric about the
$\hat{z}$-axis and is centred on $I'$. (Of course all zonal curves are always
symmetric about the $\hat{y}$-axis.) In any order, astigmatism is amongst
these symmetrical aberrations, and, at least as a matter of convenience,
these may therefore be called astigmatic. Every astigmatic
displacement varies as an even power of $h'$ and an odd power of $\rho$.

As one proceeds to higher and higher orders it becomes in- 
creasingly futile to invent ever new names for the new aberrations 
which make their appearance. It is preferable to use a terminology 
which directly reflects the results embodied in the present discussion. 
Accordingly, we call the aberration (i.e. the displacement) varying
as $\rho^{2n-\lambda-1}h'^{\lambda}$ coma of degree $\lambda$ and order $2n-1$ when $\lambda$ is odd, and
astigmatism of degree $2n-\lambda-1$ and order $2n-1$ when $\lambda$ is even.
Then, for example, secondary oblique spherical aberration becomes
secondary cubic astigmatism; circular coma (of any order) becomes
linear coma; whilst, when $n = 4$, (21.19) represents tertiary cubic
coma. Of course, in special cases one will retain the traditional
names, and will continue to refer to spherical aberration rather than
to astigmatism of degree $2n-1$, granted that the order is $2n-1$ in
each case.
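
The rule of nomenclature just stated is simple enough to be written as a small function. The Python sketch below is an illustration; the function name `classify` and the word lists are assumptions of this example (in particular "quintic" for degree five and the fallback wording for higher orders and degrees do not occur in the text).

```python
ORDER_NAMES = {2: "primary", 3: "secondary", 4: "tertiary"}
DEGREE_NAMES = {1: "linear", 3: "cubic", 5: "quintic"}   # "quintic" is hypothetical

def classify(n, lam):
    """Name the aberration varying as rho**(2n - lam - 1) * h**lam."""
    if n < 2 or not 0 <= lam <= 2 * n - 1:
        raise ValueError("need n >= 2 and 0 <= lam <= 2n - 1")
    if lam % 2 == 1:
        kind, degree = "coma", lam                 # degree = power of h'
    else:
        kind, degree = "astigmatism", 2 * n - lam - 1   # degree = power of rho
    order = ORDER_NAMES.get(n, "order-%d" % (2 * n - 1))
    degree_name = DEGREE_NAMES.get(degree, "degree-%d" % degree)
    return "%s %s %s" % (order, degree_name, kind)

for n, lam in ((3, 2), (3, 1), (4, 3), (2, 0)):
    print(n, lam, classify(n, lam))
```

Note that the degree is the power of h' for the comatic types but the power of ρ for the astigmatic types, so that spherical aberration of order 2n−1 appears as astigmatism of degree 2n−1.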

It should be borne in mind that the preceding classification is 
by no means only of academic interest, as might appear to be the 
case at first sight. The point is that although we initially supposed 
that aberrations of order less than $2n-1$ were absent, the generic
result (22.3) remains valid when this restriction is removed. The
presence of lower orders merely affects the values of the constants
$a_s$ and $b_s$, but these were not further specified in any case. At any
rate, the time has plainly come to consider the combined effects of 
aberrations of several orders, present simultaneously. 


23. Effective aberration coefficients ($v^{(n)}$ given)


Given the aberration function $v$, the displacement $\boldsymbol{\epsilon}'$ is given by
two series, each of which is a sequence of homogeneous polynomials
of degree 3, 5, 7, ... in the ray-coordinates $\mathbf{y}'$, $\mathbf{y}_1$. The polynomials of
degree $2n-1$ have coefficients which are certain combinations of the
characteristic aberration coefficients which enter into $v^{(2)}, \ldots, v^{(n)}$.
Write

$$\boldsymbol{\epsilon}'_{2n-1} = \sum_{\mu=0}^{n-1}\sum_{\nu=0}^{n-1-\mu}(u^{(n)}_{\mu\nu}\mathbf{y}' + \bar{u}^{(n)}_{\mu\nu}\mathbf{y}_1)\,\xi^{n-1-\mu-\nu}\eta^{\mu}\zeta^{\nu}. \tag{23.1}$$

Then the $u^{(n)}_{\mu\nu}$, $\bar{u}^{(n)}_{\mu\nu}$ are the new coefficients under discussion, and we
call them effective aberration coefficients. (See also the discussion
following shortly after equation (69.5).) They are defined by (23.1),
so that even if one started with the angle characteristic, for instance,
one must end up with the same series (23.1), provided that at some
stage one reverts to the variables $\mathbf{y}'$, $\mathbf{y}_1$. Of course, one could define
different effective coefficients through series exactly like (23.1)
except that the variables $\mathbf{y}'$ are replaced by coordinates in the
entrance pupil or elsewhere; see for example Section 37, after
equation (37.7). At any rate, we shall adopt (23.1). Our task is to
relate the $u^{(n)}_{\mu\nu}$, $\bar{u}^{(n)}_{\mu\nu}$ to the characteristic coefficients $v^{(n)}_{\mu\nu}$, or $t^{(n)}_{\mu\nu}$, ..., as
the case may be.

The desired end may be achieved straightforwardly enough, but 
the task becomes exceedingly laborious for orders greater than 5, 
certainly for orders greater than 7. It may be made a good deal easier 
in the following way. One first represents certain groups of terms in 
(17.5) by some collective symbol, e.g. one might write A and B for 
the factors of 2(1-+u)* and (1+), and C and E, say, for v, and »,. 
Then A = A,+A;+A,+..., where A, is a homogeneous poly- 
nomial of degree s in the variables &, 7, 6; and likewise B, C and E 
are series beginning with terms B,, C, and E, respectively, whilst 
D=1+D,+.... Proceeding in this way, it is reasonably easy to 
keep an eye on the terms of any specific degree, and one is less likely 
to miss some of them inadvertently. In particular, the quadratic 
terms of the factors multiplying y’ and y, in (17.4) are simply 


2C,+A,+uC, and E,—A,+4uk, 
respectively. Similarly, the cubic terms are 
2C,+ 4uA,—4$B,—2A,C,+ Ag +uC,—4v7C, 
and E;—4uA,+3B,—A,E,—A,+4uk,—}wc,, 


respectively, and these are the terms giving the total seventh-order 
displacement. From the results just given it is a routine matter to 


obtain the coefficients $u^{(n)}_{\mu\nu}$, $\bar{u}^{(n)}_{\mu\nu}$ for $n = 2, 3, 4, \ldots$, the case $n = 2$ being
somewhat trivial. We shall content ourselves with the explicit
results for orders 3, 5 and 7. In an obvious notation we write in
these three cases $p^*_\alpha$, $\bar{p}^*_\alpha$ ($\alpha = 1, 2, 3$), $s^*_\alpha$, $\bar{s}^*_\alpha$ ($\alpha = 1, \ldots, 6$) and $t^*_\alpha$, $\bar{t}^*_\alpha$
($\alpha = 1, \ldots, 10$) respectively for the effective coefficients. Then,
when $n = 2$,

$$p^*_1 = 4p_1, \qquad \bar{p}^*_1 = p_2,$$
$$p^*_2 = 2p_2, \qquad \bar{p}^*_2 = 2p_4, \tag{23.2}$$
$$p^*_3 = 2p_3, \qquad \bar{p}^*_3 = p_5.$$


When $n = 3$
st = 68,4 6f,, 5 = Sg—4P1+ bay 
S} = 45.—8P, + 4Po, Sf = 254+ 4P|—4P2.+ Py 
S} = 453+ 2p, — pot 3Ps) 53 = 53+ 3P2—2pst dPs, (23.3) 
St = 25,—4pot 24, SE = 387+ 2Po—4Pay 
Sf = 255+ P2—4Ps—2Pat Ps, Sf = 25g +2Pyt+3Ps—2Pss 
SG = 25g+Ps—Psy 5G = Sot $Ds. 


Before going to $n = 4$ a word must be said about the effects of
dropping the normalization condition $d' = 1$. The effective and
the characteristic aberration coefficients of order $2n-1$ have
dimensions (length)$^{-2n+2}$ and (length)$^{-2n+1}$ respectively; granted
that $v$ is defined as hitherto, i.e. without an additional scale factor.
Evidently, therefore, every primary characteristic coefficient takes
a factor $d'$ in (23.2), but a factor $1/d'$ in (23.3); whilst in the latter
$s_\alpha$ ($\alpha = 1, \ldots, 9$) takes a factor $d'$. In short, removal of the condition
$d' = 1$ entails modifications of (23.2) and (23.3) which may be represented
schematically by

$$p^* = d'p, \qquad s^* = d's + p/d'. \tag{23.4}$$

When $n = 4$


tT = 8t,+95,+ 3), —24pi, 
tF = Oty— 128, +75,—4P1+ 2 Pot 16Pi— 32p Po, 
t§ = Oty + 35, —S2+ 65g +P1— bP + 2Ps+ 4PiPa~ 24PrPs— dP 


tf = 4ty— 85,4 554—2P,—3PotPat 16P1p,— 16p,p4 — 10P3, 
tS = 4ts + 25, — 855 — 254+ 455+ 2P,—2P3— Pat dps t+ 16Pi ps
+ 8p P4— 8p, ps5 + 2p3— 16pop3— 2PaPa, 
tg = 4tg t+ 283 — 55+ 35g—3P1— 3P2+dPs— 3P5t 4PiPst 2PoPs 
—PoPs — Opi; 
ty = 2t7— 454+ 357+ Po—2Pat 43 —B8popa, 
ts = 2tg+S4—455— 357+ 253+ Ps + 3Pa—Ps + 8popst 4PoPa 
—4Pops—8PsPq4— 2P i, 
t} = 2ty +55 — 45g — 253 +Sy—hPo—Pat$Ps+ 2Pohs +403 + 4PsPa 
~4PsPs— 2PaPs, 


tty = 2ty9+5g—Syg—4P3—4)5+ 2Psps — 42, 


Ef = t,—6s, + 48,— 2p, —hp.+ 8p? —4PiPo, 

tg = 2t, + 65; — 65, + 54+ OP, — po— hPa t+ 16D Pa — 8P1 Ps — 303, 

ts = ts + $5. — 453+ 355 — 2p, + tPa—Ps— Pst 8PiPs— 2Pops 
—4PPst+3P3, 

tf = 3ty + 45, — 654+ $57 —4D1 + EPo t+ 8p, Pat OP} — 16popu, 

if = 2tg— 4S, +453 + 384 — 255 + Sgt 2D, —2P.+ 3p3t 44+ 8P1Ps 
+ 8pops+ OpoPs—4PoPs— 4PsPa 

bg = ty + $55 — 254+ 359+ BP 2 —Ps+4Ps5—3PoPs + 203 — 2PaPs, 

b= 4tyy + 254 — OS, — 2Po+ Pat Bpopa— 402, 

ts = 3ly. + 255 +357 — 45g + Po — 2P3— 2Pat bPs + 4PoPs + SpsPa 


—4Pyps+ 6f3, 
Eg = 2tyg+ 256+ 35g — 259 +p3t+$P4—Pst+ 4PaPs + Opsps — pi, 
tip = ty + B59 + 8p, + Sp3. (23.5) 


The analogue of (23.4) is here

$$t^* = d't + s/d' + p/d'^3 + pp. \tag{23.6}$$


The labour involved in obtaining the secondary, and even the 
tertiary, effective coefficients has after all been quite moderate. 
It might be remarked that one can also use an iterative method which 
allows one to proceed even more easily than by that used above. 


For this purpose one writes $1/\alpha'$ in a form which contains $\boldsymbol{\epsilon}'$
explicitly (see (35.6)) and inserts this in (17.1). However, the effective
coefficients of any order then require one to know only those of
lower order, so that the various equations above can be obtained
iteratively; cf. the discussion preceding equation (35.12). At any
rate, in the case of the angle characteristic the situation is worse, 
a state of affairs which was already alluded to after equation (17.10). 
In fact, it is so much worse that one is undoubtedly better off with 
V than with T as long as one wants to consider the exact displace- 
ments of the various orders. 

We conclude this section with the somewhat obvious remark
that between the $n(n+1)$ effective coefficients of order $2n-1$ there
must exist $\tfrac12 n(n-1)$ identities, in the sense that there are
$\tfrac12 n(n-1)$ distinct linear combinations of them which express
themselves entirely in terms of the coefficients of order less than
$2n-1$. These identities merely express the requirement that the
integrability condition on $\beta'\,dy' + \gamma'\,dz'$ must be satisfied
(cf. equation (6.7)). In principle they may therefore be obtained
directly from the equation

$$\partial\beta'/\partial z' = \partial\gamma'/\partial y',$$  (23.7)

where $\beta'$ is to be obtained from (17.1); see (35.6) and (35.7). In
the primary domain (23.7) reduces to

$$\partial\epsilon'_{y3}/\partial z' = \partial\epsilon'_{z3}/\partial y',$$  (23.8)


and this immediately gives 2p# — p¥ = o. 
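
The counting in the remark above can be checked mechanically. The following is a minimal sketch under two assumptions that belong to the example rather than to the text: that the displacement of order $2n-1$ is a sum of monomials of degree $n-1$ in $\xi, \eta, \zeta$, each multiplied by one of two vector factors, and that the characteristic function of the same order is a polynomial of degree $n$ in $\xi, \eta, \zeta$ whose pure $\zeta^n$ term does not affect the displacement. The function names are hypothetical.

```python
from math import comb


def effective_count(n: int) -> int:
    """Number of effective coefficients of order 2n-1.

    Assumed structure: monomials of degree n-1 in three variables,
    each multiplied by one of two vector factors.
    """
    return 2 * comb(n + 1, 2)


def characteristic_count(n: int) -> int:
    """Number of characteristic coefficients of order 2n-1 that matter.

    Assumed structure: monomials of degree n in three variables,
    minus the pure zeta**n term.
    """
    return comb(n + 2, 2) - 1


def identity_count(n: int) -> int:
    """Number of identities that must hold between the effective coefficients."""
    return effective_count(n) - characteristic_count(n)


if __name__ == "__main__":
    for n in range(2, 6):
        print(n, effective_count(n), characteristic_count(n), identity_count(n))
```

For $n = 2$ this leaves a single identity, in line with the one primary relation obtained from (23.8).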


24. Effective coefficients ($t^{(n)}$ given)


As an alternative to (23.3) the secondary effective coefficients are 
now to be related to the characteristic coefficients 2% and ¢@). We 
may proceed as follows. As in (19.24) we write 


$$t = (s-m)^{-1}\left[(p_1\xi^2 + \dots + p_6\zeta^2) + (s_1\xi^3 + \dots + s_{10}\zeta^3) + \dots\right].$$  (24.1)


The terms depending on $\zeta$ alone must here be included, so that $t$
is the angle characteristic after the paraxial term has been removed.
We now use variables $\sigma_y, \sigma_z$ ($= \sigma$) and $\mu_y, \mu_z$ ($= \mu$) defined by


$$\beta - s\beta' = \sigma, \qquad \beta - m\beta' = \mu,$$  (24.2)
so that $\xi = \sigma_y^2 + \sigma_z^2$, etc. Then


$$y' = (2sT_\xi + mT_\eta)\sigma + (sT_\eta + 2mT_\zeta)\mu,$$

$$y_1 = (2mT_\xi + mT_\eta)\sigma + (mT_\eta + 2mT_\zeta)\mu,$$  (24.3)
whilst (17.10) becomes
$$\epsilon' = (s-m)(2t_\xi\sigma + t_\eta\mu).$$  (24.4)
Also, let $y_e$ be coordinates in the exit pupil, i.e.
$$y_e = y' - d'\beta'/\alpha'.$$  (24.5)


(Note that since we have arranged $f$ to have the value unity, we
have to carry $d' = s-m$ along here.) Our task is to eliminate $\sigma$ and
$\mu$ in favour of $y_e$ and $y_1$.

Now, using the first member of (15.12),

$$1/\alpha' = [1 - (s-m)^{-2}(\xi - 2\eta + \zeta)]^{-1/2} = 1 + \tfrac12(s-m)^{-2}(\xi - 2\eta + \zeta) + O(4).$$
Also, from (24.2), $\beta' = (s-m)^{-1}(\mu - \sigma)$, whence
$$-d'\beta'/\alpha' = (\sigma - \mu)[1 + \tfrac12(s-m)^{-2}(\xi - 2\eta + \zeta)] + O(5).$$  (24.6)
One therefore has, on rejecting terms of degree higher than 3,
$$y_e = \{1 + [2st_\xi + mt_\eta + \tfrac12(s-m)^{-2}(\xi - 2\eta + \zeta)]\}\sigma + [st_\eta + 2mt_\zeta - \tfrac12(s-m)^{-2}(\xi - 2\eta + \zeta)]\mu,$$
$$y_1 = (2mt_\xi + mt_\eta)\sigma + [1 + (mt_\eta + 2mt_\zeta)]\mu.$$  (24.7)
We write these temporarily in the abbreviated form
$$y_e = (1+P)\sigma + Q\mu, \qquad y_1 = R\sigma + (1+S)\mu.$$  (24.8)


Bearing in mind that we intend to proceed only to the fifth order,
we need to include only terms up to the third order in the series
which are the inversion of (24.8). Since $\sigma = y_e + O(3)$, $\mu = y_1 + O(3)$,
the quantities $\xi$, $\eta$, $\zeta$ which occur in $P$, $Q$, $R$ and $S$ are, to the
required order, simply the same as the invariants $\xi$, $\eta$, $\zeta$ which
appeared in the context of the point characteristic; and so here


$$\sigma = (1-P)y_e - Qy_1, \qquad \mu = -Ry_e + (1-S)y_1.$$  (24.9)
However, from (24.1) and (24.4),


$$\epsilon' = [(4p_1\xi + 2p_2\eta + 2p_3\zeta) + (6s_1\xi^2 + 4s_2\xi\eta + 4s_3\xi\zeta + 2s_4\eta^2 + 2s_5\eta\zeta + 2s_6\zeta^2)]\sigma$$
$$\qquad + [(p_2\xi + 2p_4\eta + p_5\zeta) + (s_2\xi^2 + 2s_4\xi\eta + s_5\xi\zeta + 3s_7\eta^2 + 2s_8\eta\zeta + s_9\zeta^2)]\mu.$$  (24.10)
Continuing to reject terms of degree greater than 5, the
substitutions (24.9) will leave the quintic terms, taken by themselves,
unaffected, but they will produce a large number of additional terms
arising from the cubic members. Denoting these additional terms
by $\Delta\epsilon'$ one thus has


$$-\Delta\epsilon' = \{4p_1(3P\xi + 2Q\eta) + p_2[3R\xi + 2(2P+S)\eta + 2Q\zeta] + 2p_3[2R\eta + (P+2S)\zeta] + 2p_4R\eta + p_5R\zeta\}\,y_e$$
$$\qquad + \{4p_1Q\xi + p_2[(2P+S)\xi + 4Q\eta] + 2p_3Q\zeta + 2p_4[R\xi + (P+2S)\eta + Q\zeta] + p_5(2R\eta + 3S\zeta)\}\,y_1.$$  (24.11)
Here, in turn,
$$(s-m)P = (4sp_1 + mp_2 + a)\xi + 2(sp_2 + mp_4 - a)\eta + (2sp_3 + mp_5 + a)\zeta,$$
$$(s-m)Q = (sp_2 + 2mp_3 - a)\xi + 2(sp_4 + mp_5 + a)\eta + (sp_5 + 4mp_6 - a)\zeta,$$
$$(s-m)R = m(4p_1 + p_2)\xi + 2m(p_2 + p_4)\eta + m(2p_3 + p_5)\zeta,$$
$$(s-m)S = m(p_2 + 2p_3)\xi + 2m(p_4 + p_5)\eta + m(p_5 + 4p_6)\zeta,$$  (24.12)


where $a = \tfrac12(s-m)^{-1}$. (24.12) must now be substituted in (24.11),
and the subsequent expression suitably rearranged. This is a most
tedious process, the final outcome of which is embodied in the
following set of equations giving the secondary effective in terms
of the characteristic coefficients. (The primary relations (23.2) apply
here also.)


$$s_1^* = 6s_1 - 2a(48sp_1^2 + 24mp_1p_2 + 3mp_2^2 + 12ap_1),$$
$$s_2^* = 4s_2 - 8a[12sp_1p_2 + 8mp_1p_3 + 8mp_1p_4 + 3mp_2^2 + 2mp_2p_3 + 2mp_2p_4 - a(8p_1 - p_2)],$$


S$ = 453 —2a[32SP Py + 16mp, pz + 25P3 + lOMpypz+ Mops 
+ 8mp§ + a(12p, — 2p. + 2ps)), 
Sq = 25 —2a[16sp p+ 16mp, ps + 85p3 + 14mp,p,+ 6mp 2p, 
+ 7Mpops,+ 8mpsp,+ 4mpi + 8a(2p;—Po)], 


S§ = 25, —8al2sp, p, + 8p, Pe + 45PaP3+ Post 3MPops+ 2MPape 
+ 2mp}+7mpsp,+ 5mpsp; + mpsPs— 24(P1 —Po+Ps)], 


= 25, — 2a[25pop; + 8mpop, + 45p3 + 8mpsp, + 16mpsp, 
-+ mp3 — 2a(p2—ps)], 


sf = s,—2a[125p, po + 8mp,p, + 16mp,p,+ 3mp3+ 2p oP 
+ 4mp,p,~ 2a(2p; — Ps) I, 


5 = 25,—4a[8sp,p, + 8mp,p; + 45p3 + 4mpop,+ 1OMP.p, + 2MP2 ps; 
+ 4mpsp,+ 4mpi+ a(4p,—4PotPa)l, 


53 = 85— 4028p); + 8mp Pet 35PaPst SpaPat 3MPoPs+ 2MPz2Py 
+ 2mp3 + Ompsp.+ 3MpsP5 + 2MpsPs 
— (2p — Pot Pst Pa)], 


54 = 357 —8a[3spop4t 3mpoPs + 3p? + 3mpsPs + a(2p.—P,)], 


53 = 25g — 4a[25pops + 8mpopet 45PsP4+ 4mMPsp, + 2spi + 8mp,ps 
+ 8mp,p_+ 4mpz — a(2p.—2p3— 3P,4)], 


5g = Sg—2a[2Spgp5 + 8mpsP_ + 25p4P5 + 8mpyp, + 3mp2 
+ 12mpsp_—2a(ps+ py]. (24.13) 


Here the sixth primary coefficient p, appears explicitly, as was to be 
expected, according to the discussion following equation (17.11). 
The starred coefficients are of course the same as those in (23.3). 
It should be mentioned that the equations (24.13) have not been 
reliably checked, and so may contain errors. Their complexity, as 
compared with (23.3), is striking. Moreover, the labour involved in 
their derivation is probably no less than that required to get the 
seventh-order equations (23.5). However, in practice one may be 
content with results which are approximate to the extent that we 
consider e’ merely as the sum of the displacements of order 2n —1 
generated by the aberration function of that order alone; under 
which circumstances the discussion of Sections 19-21 would be 
directly relevant. In other words, one merely substitutes $y_e$ for $\sigma$
and $y_1$ for $\mu$ in (17.10). In general terms this means that if any given
ray be specified by the values of its coordinates $y_e$, $y_1$, then the series
(17.10) gives the displacement not for the ray in question but rather
for that ray whose coordinates have the values $y_e + \delta y_e$, $y_1 + \delta y_1$,


where dy = (—Syi+ Oy)A, ay. =(Ry—Py) A, (24-14) 
with 1/A = (1 -—P)(1—S)— OR. The coordinate differences (24.14) 
are obviously at least of the third order; and we may expect the error 
committed by their omission to be small for reasonably well 
corrected systems. (See also the discussion following shortly after 
equation (69.5).) 
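
Since the text states that (24.13) has not been reliably checked, individual entries can be tested against (24.11) and (24.12) by exact rational arithmetic. The sketch below is an illustration only: it takes the coefficient of $y_e$ in $-\Delta\epsilon'$ to be the quadratic form in $\xi, \eta, \zeta$ written in (24.11), with $P, Q, R, S$ as in (24.12), and extracts the $\xi^2$ and $\xi\eta$ entries by polarization. The function names, and the name `s_const` for the constant $s$, are hypothetical.

```python
from fractions import Fraction


def pqrs_coefficients(p, s_const, m):
    """(xi, eta, zeta) coefficients of P, Q, R, S, following (24.12)."""
    p1, p2, p3, p4, p5, p6 = p
    k = 1 / Fraction(s_const - m)   # 1/(s - m)
    a = k / 2                       # a = 1/(2(s - m))
    P = (k * (4 * s_const * p1 + m * p2 + a),
         2 * k * (s_const * p2 + m * p4 - a),
         k * (2 * s_const * p3 + m * p5 + a))
    Q = (k * (s_const * p2 + 2 * m * p3 - a),
         2 * k * (s_const * p4 + m * p5 + a),
         k * (s_const * p5 + 4 * m * p6 - a))
    R = (k * m * (4 * p1 + p2), 2 * k * m * (p2 + p4), k * m * (2 * p3 + p5))
    S = (k * m * (p2 + 2 * p3), 2 * k * m * (p4 + p5), k * m * (p5 + 4 * p6))
    return P, Q, R, S


def minus_delta_eps_ye(p, s_const, m, xi, eta, zeta):
    """Coefficient of y_e in -Delta eps', following (24.11)."""
    p1, p2, p3, p4, p5, _ = p
    P, Q, R, S = (c[0] * xi + c[1] * eta + c[2] * zeta
                  for c in pqrs_coefficients(p, s_const, m))
    return (4 * p1 * (3 * P * xi + 2 * Q * eta)
            + p2 * (3 * R * xi + 2 * (2 * P + S) * eta + 2 * Q * zeta)
            + 2 * p3 * (2 * R * eta + (P + 2 * S) * zeta)
            + 2 * p4 * R * eta + p5 * R * zeta)


def correction_xi2(p, s_const, m):
    """Amount subtracted from 6*s_1 in the xi**2 * y_e entry."""
    return minus_delta_eps_ye(p, s_const, m, 1, 0, 0)


def correction_xi_eta(p, s_const, m):
    """Amount subtracted from 4*s_2 in the xi*eta * y_e entry (polarization)."""
    def e(x, y):
        return minus_delta_eps_ye(p, s_const, m, x, y, 0)
    return e(1, 1) - e(1, 0) - e(0, 1)
```

The remaining entries of the $y_e$ part can be extracted in the same way by choosing other values of $(\xi, \eta, \zeta)$; the $y_1$ part would need the second brace of (24.11).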

Despite what has just been said the presence of the lower-order 
coefficients in the effective coefficients of a given order is of course 
important in principle. A strict discussion of the existence or 
otherwise of focal curves, for instance, plainly requires the use of the 
exact coefficients w/”), a of the various orders. 

25. The total secondary displacement 

Since the primary effective coefficients are those which enter into 
(19.2)—cf. (23.2)—the primary displacement continues to be 
given by (19.5). The secondary total displacement on the other hand 
is to be obtained from (23.1) and (23.3). One easily finds that 


€ys = 6(8, + py) p* cos O + [(352— 8p; + $P2) + 2(S2—2P1 + po) 

x cos 24] pth’ + [(453 + 254+ Op; — 5P2+ 3Ps + Da) 

+ 2(S4—2~.+p,) cos?@] cos 0 ph’? + [(2s5 + 3s, 

+ 3p2—4P3—3Pa+Ps) + (55+ 357+ 322 — 2Ps— 324 

+ $5) cos 20] p*h’? + (256+ 253+ 3P3+ 3Pa— 32s) 

x 0080 ph'#-+(5,-+3ps) A, Cs) 
€25 = 6(5, + py) p° Sin O + 2(s,— 2p, + po) pth’ sin 20 

+ [(483+ 2P, —Po+ 3Ps) + 2(S4— 2P2 + Py) cos? GO] 

x sin 6 p*h + (85+ 3P2—2P3—Pat 3Ps) 

x sin 26 p*h' + (254+ ps3 —ps) sin @ ph’*. 
The general character of the various partial displacements is 
essentially the same as that described in Section 20. This is not to 
say that all previous results remain unmodified. For example, one 
now has, in place of (20.6), 

Kts/Kg5s = (58.— 12), + 2P2)/(S2— 401 + 32). (25.2) 
This kind of modification of previous results is evidently of a 
rather trivial nature. None the less, one should bear in mind that 
in the presence of (primary) spherical aberration one has, in principle 
at least, some effective secondary circular coma 


$$\epsilon'_{y5} = -4p_1(2 + \cos 2\theta)\rho^4h', \qquad \epsilon'_{z5} = -4p_1\sin 2\theta\,\rho^4h',$$  (25.3)
which should be compared with (19.12). 

In conclusion, it may be worth writing down the displacement for 
all p and 0, correct to the fifth order, when the displacement is known 
to vanish for all meridional rays. Writing down the various con- 
ditions which ensure this state of affairs, one finds that 

6), = — 2(Sy+p,) sin? 6 cos 6 p*h’ — (s, — 2p,) sin? 6 p?h’, 

ef = —2pysin Oph’? — [(454-+ 3s) — 2(54+P4)cos?O] sin Op%h® + (25.4) 

— (S_—p,) sin 20 p*h’? +- (2s, — py) sin 8 ph’*. 

These expressions are of interest because in the course of design a 
great deal of emphasis is often laid in practice on the performance of 
systems in the meridional section alone, supposedly as a measure 
of its over-all performance. It will be seen that even in the absence 
of curvature of field (p, = sg = 0) there is a residual secondary dis- 
placement, made up of certain amounts of oblique spherical aber- 
ration and elliptical coma. 


Problems 


P.4(i). The reduced magnification associated with a pair of con- 
jugate planes is #. Determine their location, i.e. the quantities 
q and q’, in terms of s, m and f. 


P.4(ii). Equations (14.5) hold for arbitrary rays if one understands 
A, B, C, D to be functions of €, 7, ¢. Show that (14.7) is still valid. 


P.4 (iii). The ideal behaviour of a certain symmetric system K 
is prescribed as follows. K forms a sharp, but possibly distorted, 
plane image in %’ of a plane object in %. Show that KH has the 
generic form 

$$V_K = a(\zeta) - (d'^2 + \xi - 2D\eta + D^2\zeta)^{1/2},$$
where $D$ is a function of $\zeta$ only. What is the significance of $D(\zeta)$?


P.4(iv). K is a symmetric system of which it is desired that it 
should form a sharp image of a plane object in 4, this image lying 
in some surface of revolution /. Show that K has the generic form 
$V_K = a(\zeta) - (d'^2 + \xi - 2D\eta + K)^{1/2}$, where $D$ and $K$ are certain
functions of $\zeta$ alone. What is the significance of these functions? What
is the curvature of F at its pole, if $k_1 = (dK/d\zeta)_{\zeta=0}$?

P.4(v). Obtain the displacement e’ in the ideal image plane 7’ 
for the characteristic function \ of the previous problem. 


P.4(vi). Justify equation (15.14). 
P.4 (vii). Deduce the result (15.20). 


P.4(viii). Let the ideal imagery of a certain system K be implicit 
in its ideal point characteristic (15.6). Suppose that its actual point 
characteristic V is that given in problem P.4 (iii). Determine the 
third- and fifth-order aberration functions v® and v® of K. Also 
determine v if its point characteristic V is that given in problem 
P.4 (iv). (Note: set d’ = 1 throughout.) Consider your results in 
the light of equations (23.2-3). 


P.4 (ix). Justify equation (19.11). 
P.4(x). Obtain the result (21.7). 


P.4(xi). Determine the conditions which must be satisfied to 
ensure that, correctly to the fifth order, the image patch should be 
exactly symmetric about a line parallel to the $z$-axis. 


P.4 (xii). If K is the system of example P.4 (iv), let it be further 
given that effective distortion of all orders is absent. Show that one 


must have D=(1+ o)4 (1 +Ky)t. 
(Take d’ = 1. Note that distortion refers to #’.) 
P.4 (xiii). Continuing the preceding problem, show that the 
equation of the image surface is 

X’ = d,(Y'+Z") + (d,— dj) (Y'2+Z’%)? 4+ O(6). 
P.4(xiv). The point characteristic of a certain system K is 

V = g()+3f(u), where u=f—29+6, 

as usual. Show that all rays from a given object point O which make 


a fixed angle with the axis of K in the image space intersect ¥’ in 
a circle. Is this circle concentric with 1’? 


CHAPTER 5 


THE SYMMETRIC SYSTEM (PART II) 


A. The sine-condition 
26. Spherical aberration and circular coma 


In certain situations which occur in practice the image height is so
small that all aberrations varying non-linearly with $h'$ may be
completely disregarded. This means that only spherical aberration
and circular coma need to be considered. Bearing in mind that of
these two types the second can be controlled by suitably adjusting
the position of the stop, it is of importance to have at hand certain
general results which can be used in a simple way to test for the
presence of circular coma, without the need to examine anything
but axial rays, i.e. rays through the axial point $O_0$ of $\mathcal{I}$.

To begin with, all terms of the aberration function varying
non-linearly with $h'$ are now to be omitted, so that $v$ has the generic form


$$v = S(\xi) + \eta C(\xi),$$  (26.1)
where
$$S(\xi) = \sum_{n=2}^{\infty} s_{n-1}\xi^n, \qquad C(\xi) = \sum_{n=2}^{\infty} c_{n-1}\xi^{n-1}.$$  (26.2)


Here $s_{n-1}$, $c_{n-1}$ are the (characteristic) coefficients of spherical
aberration and circular coma, previously denoted by $v_1^{(n)}$ and $v_2^{(n)}$
respectively.

We shall frequently have to refer to the exact displacement
induced by (26.1), and we proceed to write it down. To this end it is
desirable to introduce a number of abbreviations. Thus let
$$w = (1+\xi)^{1/2}, \quad P = 1 - 2w\dot{S}, \quad Q = 1 - 2w^3\dot{C}, \quad R = 1 + wC,$$
$$J = (w^2 - \xi P^2)^{-1/2} = (1 + 4w\xi\dot{S} - 4w^2\xi\dot{S}^2)^{-1/2},$$  (26.3)
where a dot denotes differentiation with respect to $\xi$. Note that the
various functions just defined depend upon $\rho$ alone. Always
rejecting terms depending non-linearly upon $h'$, we find that


B’ == [-(P+Qn/w*)y’ +Ryil, 
” (26.4) 
B = —2mgcy,—~ (Ry'—y;)- 
A somewhat tedious calculation, based directly on (17.1), then yields
the required result


$$\epsilon' = [(1 - JP) + J^3(P^2R - Q)\eta]\,y' - (1 - JR)\,y_1.$$  (26.5)
If $\sigma$ stands for spherical aberration alone ($\theta = 0$), we thus have
$$\sigma = \rho(1 - JP),$$  (26.6)
whilst the tangential and sagittal circular coma are given by
$$\kappa_t = -h'[1 - JR + \xi J^3(Q - P^2R)], \qquad \kappa_s = -h'(1 - JR).$$  (26.7)


We emphasize that equations (26.5)-(26.7) are exact, and all orders
of aberrations are included.
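
The exact expressions of this section are easy to evaluate numerically. The following is a minimal sketch, assuming the unit of length $d' = 1$, the series $S(\xi) = \sum_q s_q\xi^{q+1}$ and $C(\xi) = \sum_q c_q\xi^q$ with $\xi = \rho^2$, and the abbreviations $w = (1+\xi)^{1/2}$, $P = 1 - 2w\dot{S}$, $Q = 1 - 2w^3\dot{C}$, $R = 1 + wC$, $J = (w^2 - \xi P^2)^{-1/2}$. The function names and the list layout of the coefficients are hypothetical.

```python
import math


def axial_functions(rho, s, c):
    """Return (xi, w, P, Q, R, J) at semi-aperture rho.

    s[q-1] holds s_q and c[q-1] holds c_q (q = 1, 2, ...).
    """
    xi = rho * rho
    w = math.sqrt(1.0 + xi)
    s_dot = sum((q + 1) * sq * xi**q for q, sq in enumerate(s, start=1))
    c_val = sum(cq * xi**q for q, cq in enumerate(c, start=1))
    c_dot = sum(q * cq * xi**(q - 1) for q, cq in enumerate(c, start=1))
    P = 1.0 - 2.0 * w * s_dot
    Q = 1.0 - 2.0 * w**3 * c_dot
    R = 1.0 + w * c_val
    J = (w * w - xi * P * P) ** -0.5
    return xi, w, P, Q, R, J


def spherical_aberration(rho, s, c):
    """Exact sigma = rho (1 - J P)."""
    _, _, P, _, _, J = axial_functions(rho, s, c)
    return rho * (1.0 - J * P)


def coma_per_unit_height(rho, s, c):
    """Exact (kappa_t / h', kappa_s / h')."""
    xi, _, P, Q, R, J = axial_functions(rho, s, c)
    kappa_s = -(1.0 - J * R)
    kappa_t = -(1.0 - J * R + xi * J**3 * (Q - P * P * R))
    return kappa_t, kappa_s


if __name__ == "__main__":
    print(spherical_aberration(0.2, [0.05], [0.1]))
    print(coma_per_unit_height(0.2, [0.05], [0.1]))
```

For small $\rho$ the first function approaches the primary value $4s_1\rho^3$, and with coma alone the ratio of the two comatic quantities approaches 3.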


27. The sine-relation 


For the sake of orientation we first contemplate a pair of perfect 
conjugate planes. The point characteristic of K associated with 


these is, as usual, Varn Gs, (27.1) 
in the notation of Section 17. Then


$$\beta - m\beta' = -2my_1g_\zeta(\zeta).$$  (27.2)
For axial rays $y_1 = 0$, and the right-hand member of (27.2) vanishes.


If $\phi$, $\phi'$ are the angles which the initial and final rays make with
the axis of K, one therefore has


$$\sin\phi = m\sin\phi'.$$  (27.3)


This result constitutes the so-called sine-relation. It will evidently
be valid under conditions much less stringent than those assumed
above, for (27.3) will be unaffected by the addition to the right-hand
member of (27.1) of any aberration function $v$ which depends
non-linearly on $h'$, i.e. which is such that $\partial v/\partial y'$ and $\partial v/\partial y_1$ both vanish
when $y_1 = 0$. In other words, the sine-relation holds if $\mathcal{I}$ and $\mathcal{I}'$
are perfect in a region around $O_0$, the linear dimensions of which are
$O(1)$. In such a region only spherical aberration and circular coma
(of all orders) play a part; and we conclude that the complete
absence of these entails the validity of (27.3).

(27.3) requires a slight modification when the object is at infinity,
which is most easily obtained by a limiting process. If a meridional
axial ray intersects $\mathcal{E}$ in the point whose $y$-coordinate is $y_e$, one has


Ye _ SMYe 
rer many (27-4) 


in view of (14.28). Now divide throughout by $m$, and then let $m$
tend to zero. According to (27.4) $\tan\phi/m$ tends to the value $-y_e/f$,
so that, when $m = 0$, (27.3) is to be replaced by


$$f\sin\phi' = -y_e.$$  (27.5)


tan gd = 


28. The sine-condition 


In view of the remarks following upon equation (27.3), we go on
to investigate the consequences of the sine-condition, i.e. the
condition that the sine-relation (27.3) should hold. We therefore
consider the quantity

$$\Gamma(\rho) = \sin\phi' - m^{-1}\sin\phi.$$  (28.1)


Since only axial rays are contemplated here, we may set $\theta = 0$,
and of course $y_1 = 0$. Then, at once, from (26.4),


$$\Gamma(\rho) = \rho(2\dot{S} + C) = \sum_{n=2}^{\infty}(2ns_{n-1} + c_{n-1})\rho^{2n-1}.$$  (28.2)


If this is to vanish for all values of $\rho$ the sum $2\dot{S} + C$ must vanish;
and this is the necessary and sufficient condition for the validity
of (27.3).

In practice $\Gamma(\rho)$ will of course in general fail to vanish for all
values of $\rho$, and then (28.2) is not very informative. Even when $\dot{S}$
vanishes for some value of $\rho$ at which $\Gamma = 0$, it tells us only that
$C(\rho^2) = 0$ for that value of $\rho$. This, however, does not lead to any
exact statements about the true displacement. At best we can proceed
approximately by considering the pseudo-displacement $^*\epsilon'$ which is the
displacement calculated by omitting from the effective coefficients of
order $2n-1$ all coefficients of order less than $2n-1$. In the notation
of Section 11, ia . 
te! = Sein alo. (28.3) 


Continuing to reject all terms non-linear in $h'$,
$$^*\epsilon'_y = 2\cos\theta\,\rho\dot{S} + [(1 + \cos 2\theta)\rho^2\dot{C} + C]h',$$
$$^*\epsilon'_z = 2\sin\theta\,\rho\dot{S} + \sin 2\theta\,\rho^2\dot{C}h'.$$  (28.4)
The pseudo-displacement due to spherical aberration is $2\rho\dot{S}$,
having set $\theta = 0$; whilst, on setting $\theta = \tfrac12\pi$, the sagittal
pseudo-displacement $^*\kappa_s$ due to circular coma turns out to be $Ch'$. We
have thus inferred that if for some value of $\rho$ both $\Gamma$ and $\dot{S}$ vanish,
then $^*\kappa_s$ also vanishes. This result is defective in two ways, first,
because it relates only to pseudo-displacements, second, because
two separate conditions must be satisfied before anything can be
said about the vanishing of $C$. We shall therefore improve upon this
situation in various ways in the following sections.


29. The modified sine-condition 


We just saw that the vanishing of $\Gamma$ for a particular value of $\rho$ led
merely to a conclusion which involved $\dot{S}$ and $C$ jointly, and was
therefore of little practical value even if pseudo-displacements were
regarded as a sufficient approximation to true displacements. A
rather trivial modification of the sine-condition, however, allows
us to consider $C$ separately from $S$.

To this end, let an axial ray intersect $\mathcal{E}'$ in the point $D'$ and the
axis of K in the point $P'$, so that the previous angle $\phi'$ is the angle
between the axis and the line $D'P'$. We may instead consider the
angle $\bar\phi'$ between the axis and the line $D'O_0'$. By inspection,

$$\sin\bar\phi' = -\rho(1 + \rho^2)^{-1/2},$$
so that
$$\sin\phi - m\sin\bar\phi' = -m\rho C(\rho^2).$$  (29.1)
In place of $\Gamma$, define the quantity
$$\Delta(\rho) = \sin\bar\phi' - m^{-1}\sin\phi = \rho C(\rho^2).$$  (29.2)
(The case of infinite object distance is easily accommodated as
before.) Then the condition $\Delta(\rho) = 0$ will be called the modified
sine-condition.

If now $\Delta(\rho) = 0$ for all values of $\rho$, then all (characteristic)
coefficients of circular coma must be zero, irrespectively of the
presence of spherical aberration; so that we have a much simpler
theoretical situation than that which obtained with regard to
the original sine-condition. Moreover, if $\Delta$ vanishes for some
particular value of $\rho$, then the total sagittal circular comatic
pseudo-displacement vanishes for that value of $\rho$. Of course, again
nothing can be said about the value of the tangential coma.
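
The two offences can be compared numerically for a given pair of functions $S$ and $C$. The sketch below is illustrative only: it assumes $d' = 1$, the series $S(\xi) = \sum_q s_q\xi^{q+1}$, $C(\xi) = \sum_q c_q\xi^q$, and the axial-ray direction sines $\sin\phi' = -\rho P/w$, $\sin\phi/m = -\rho R/w$ with $w = (1+\rho^2)^{1/2}$, $P = 1 - 2w\dot{S}$, $R = 1 + wC$, together with $\sin\bar\phi' = -\rho/w$ for the line drawn to the ideal image point. All function names are hypothetical.

```python
import math


def series_values(rho, s, c):
    """Return (w, S_dot, C) at xi = rho**2; s[q-1] = s_q, c[q-1] = c_q."""
    xi = rho * rho
    w = math.sqrt(1.0 + xi)
    s_dot = sum((q + 1) * sq * xi**q for q, sq in enumerate(s, start=1))
    c_val = sum(cq * xi**q for q, cq in enumerate(c, start=1))
    return w, s_dot, c_val


def sine_offences(rho, s, c):
    """Return (Gamma, Delta): original and modified sine-condition offences."""
    w, s_dot, c_val = series_values(rho, s, c)
    P = 1.0 - 2.0 * w * s_dot
    R = 1.0 + w * c_val
    sin_phi_prime = -rho * P / w       # actual ray in the image space
    sin_phi_over_m = -rho * R / w      # initial ray, divided by m
    sin_phi_bar_prime = -rho / w       # line to the ideal image point
    gamma = sin_phi_prime - sin_phi_over_m
    big_delta = sin_phi_bar_prime - sin_phi_over_m
    return gamma, big_delta


if __name__ == "__main__":
    print(sine_offences(0.3, [0.05, 0.01], [0.1, -0.02]))
```

The first value depends on $\dot{S}$ and $C$ jointly, the second on $C$ alone, which is the point of the modification.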
30. Axial rays and approximate displacement 


The displacement $\epsilon'$ is implicit in the functions $S$ and $C$, the actual
connection between them being provided by (26.5). In this
section we investigate how one might obtain $\epsilon'$ for all values of
$\theta$, provided one is content with an approximation which amounts to
the assumption that the effects of spherical aberration and circular
coma of orders greater than $2k+1$ are relatively small, where $k$ is
some integer which one will, in practice, take to be 2 or 3.
To begin with, observe that for axial rays equations (26.4) at
once give
$$\dot{S} = \frac12\left(\frac{\beta'}{\rho} + \frac{1}{w}\right), \qquad C = -\frac{\beta}{m\rho} - \frac{1}{w}.$$  (30.1)


Now let $k$ rays, having $\rho = \rho_1, \rho_2, \dots, \rho_k$ respectively, be accurately
traced through K. For each such ray the corresponding values of
$\dot{S}$ and $C$ ($\dot{S}_j$ and $C_j$, say, for the $j$th ray) may be read off from the
traces. Next, truncate the series (26.2), each after its $k$th term. Then
one has two separate sets of $k$ linear equations


$$\sum_{q=1}^{k}(q+1)\rho_j^{2q}s_q = \dot{S}_j, \qquad \sum_{q=1}^{k}\rho_j^{2q}c_q = C_j,$$  (30.2)


whose solutions give the (approximate) values of the unknown
characteristic coefficients $s_1, \dots, s_k$ and $c_1, \dots, c_k$ respectively. Once
these are known, the effective coefficients of spherical aberration 
and circular coma may be obtained after the fashion of Section 23, 
the situation being relatively simple here since only the coefficients 
u), ul? and u\®) (n = 2,...,k+1) are required here. (For n = 2,3, 4 
the results are already given by (23.2), (23.3) and (23.5).) In short, 
one now has an expression for the displacement which provides 
not only the sagittal circular coma, but the circular comatic dis- 
placement for rays having any value of 0. 
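
The fitting step described above amounts to solving two small linear systems. The following is a minimal sketch, assuming that the $k$ traced rays have already been reduced to values $\dot{S}_j$ and $C_j$, and that the truncated systems read $\sum_{q=1}^{k}(q+1)\rho_j^{2q}s_q = \dot{S}_j$ and $\sum_{q=1}^{k}\rho_j^{2q}c_q = C_j$. Exact rational arithmetic is used only to keep the illustration free of rounding; the function names are hypothetical.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(b)
    rows = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(a, b)]
    for col in range(n):
        # pick a row with a non-zero entry in this column
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pv = rows[col][col]
        rows[col] = [x / pv for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[col])]
    return [row[n] for row in rows]


def fit_coefficients(rhos, s_dots, c_vals):
    """Return ([s_1..s_k], [c_1..c_k]) from k traced axial rays."""
    k = len(rhos)
    a_s = [[(q + 1) * Fraction(r) ** (2 * q) for q in range(1, k + 1)] for r in rhos]
    a_c = [[Fraction(r) ** (2 * q) for q in range(1, k + 1)] for r in rhos]
    return solve_linear(a_s, s_dots), solve_linear(a_c, c_vals)
```

With $k$ distinct non-zero values of $\rho_j$ both systems are non-singular. When the true series contain more than $k$ terms the solutions are only approximations, as the text notes.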


31. Offence against the sine-condition 


We saw in Section 29 that the value of A(p) is an immediate measure 
of the extent to which a circular comatic term is present in the 
aberration function. It would therefore be appropriate to refer to 
A(p) as the ‘offence against the modified sine-condition’. However, 
we also saw that a knowledge of the value of C for some value of p 
did not imply an accurate knowledge of comatic displacements, 
whether sagittal or otherwise; in the sense that we had to be content
with pseudo-displacements. From a practical point of view this is
not a happy state of affairs, and we must seek to improve it.

In view of the close relationship between $\Delta$ on the one hand and
$^*\kappa_s$ (rather than $^*\kappa_t$) on the other, we first turn our attention to the
exact total sagittal circular comatic displacement $\kappa_s$. We shall
usually refer to this simply as ‘sagittal coma’, but the true meaning
of this term, as here intended, must constantly be borne in mind.
Accordingly, reference to (26.7) shows that we are concerned with 
the quantity 


$$K = JR - 1.$$  (31.1)
Now, when $y_1 = 0$ (and $\theta = 0$),
$$w\beta' = -\rho P, \qquad w\beta/m = -\rho R,$$  (31.2)
according to (26.4). From these it follows at once that
$$K = \frac{JP\beta}{m\beta'} - 1.$$  (31.3)


However, let $\delta$ be the longitudinal spherical aberration, i.e. the
distance from $O_0'$ to the point in which the axial ray given by $\rho$
intersects the axis. Then we have


$$\delta = \sigma/(\rho - \sigma) = 1/JP - 1,$$  (31.4)
because of (26.6). (31.3) therefore becomes finally
K=-34+ yeaa (31.5) 
It is easily confirmed that this may be written in the equivalent, 
simple form B 
sae aaa (31.6) 


It is remarkable that, except for an over-all change of sign,
$K$ is exactly the quantity called ‘the offence against the sine-condition’
by A. E. Conrady in Part I of his work Applied Optics and Optical
Design (Dover Publications, 1957, chapter 7, p. 370), and denoted
by him by the composite symbol OSC'. It is rather striking that so
simple a relation as $K = -\mathrm{OSC}'$ should obtain exactly, since the
details of Conrady’s work are quite difficult to compare with what
has been done above. This is due in part to the fact that his ‘coma’
is not directly defined in terms of displacements in the ideal image
plane. Again, we have taken sagittal rays to be those for which
$\theta = \pm\tfrac12\pi$, but Conrady defines them to be rays which have $\theta_e = \pm\tfrac12\pi$,
where $\theta_e$ is an angular coordinate in the entrance pupil; and these
alternative definitions are not exactly equivalent.

At any rate, one must be careful not to interpret the phrase
‘offence against the sine-condition’ in the light of (28.1); for
when $K = 0$ the quantity $\Gamma(\rho)$ of equation (28.1) will have the
value $-\beta'\delta$, rather than zero. In other words, one is essentially
concerned with another ‘modified sine-condition’ $\Gamma^*(\rho) = 0$,
where $\Gamma^*$ is like $\Gamma$, but with the magnification $m$ replaced by the
‘apparent magnification’ $m(1+\delta)$. (With an arbitrary unit of length
one must of course write $m(d'+\delta)$ here.) Thus


$$\Gamma^*(\rho) = \sin\phi' - [m(1+\delta)]^{-1}\sin\phi = -K\sin\phi'.$$  (31.7)


In conclusion we restate the main result of this section: if $K$
has been calculated from (31.5) for an axial ray given by some value
of $\rho$, then, irrespectively of the presence of spherical aberration,
the circular comatic displacement $\kappa_s$ of sagittal rays, specified by the
same value of $\rho$ and by the ideal image height $h'$, is exactly $Kh'$.
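
The statement that a single axial ray trace suffices can be illustrated numerically. The sketch below assumes that a trace delivers the direction sines $\beta'$ and $\beta/m$ of the ray and its longitudinal spherical aberration $\delta$, and it combines $K = JP\beta/(m\beta') - 1$ with $1/JP = 1 + \delta$ to obtain $K = \beta/[m\beta'(1+\delta)] - 1$. The "trace" is a stand-in generated from assumed series for $S$ and $C$ (with $d' = 1$), not a real ray trace, and all names are hypothetical.

```python
import math


def axial_trace_stand_in(rho, s, c):
    """Synthetic substitute for an exact axial ray trace.

    Uses S(xi) = sum s_q xi**(q+1), C(xi) = sum c_q xi**q, xi = rho**2.
    Returns (beta_prime, beta_over_m, delta, K_exact).
    """
    xi = rho * rho
    w = math.sqrt(1.0 + xi)
    s_dot = sum((q + 1) * sq * xi**q for q, sq in enumerate(s, start=1))
    c_val = sum(cq * xi**q for q, cq in enumerate(c, start=1))
    P = 1.0 - 2.0 * w * s_dot
    R = 1.0 + w * c_val
    J = (w * w - xi * P * P) ** -0.5
    sigma = rho * (1.0 - J * P)          # transverse spherical aberration
    delta = sigma / (rho - sigma)        # longitudinal spherical aberration
    return -rho * P / w, -rho * R / w, delta, J * R - 1.0


def offence_from_trace(beta_prime, beta_over_m, delta):
    """K from the data of one axial ray; sagittal coma is then K * h'."""
    return beta_over_m / (beta_prime * (1.0 + delta)) - 1.0


if __name__ == "__main__":
    bp, bm, d, k_exact = axial_trace_stand_in(0.3, [0.05, 0.01], [0.1, -0.02])
    print(offence_from_trace(bp, bm, d), k_exact)
```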


32. Offence against the sine-condition and tangential 
circular coma 


We have just seen that the trace of a single axial ray of
‘semi-aperture’ $\rho$ allows one to calculate the offence against the
sine-condition $K$, and that this is exactly equivalent to a knowledge of
the total circular coma $\kappa_s$ for sagittal rays of the same semi-aperture.
Unfortunately it does not tell us anything of substance about the
tangential circular coma $\kappa_t$. It is easy to see why this should be
so. According to (31.2) a single axial ray trace allows us to evaluate
the functions $P$ and $R$, and hence $K$, for the value of $\rho$ in question.
It cannot, however, yield the value of $Q$; yet this quantity enters
into $\kappa_t$. Evidently, to calculate $\kappa_t$ we require, at least implicitly,
the value of the derivative of $C$. In effect this means that the best we
can do is to trace several axial rays, and obtain the value of $\dot{C}$, or
of some quantity equivalent to it, by a process of interpolation.
Although one is thus confronted with a problem somewhat more
involved than that relating to the calculation of $\kappa_s$, it should be
borne in mind that we continue to operate with axial ray traces alone.
We proceed to develop the required details.
According to (26.7), if $\bar{K} = \kappa_t/h'$,


$$\bar{K} = \xi J^3(P^2R - Q) + K.$$  (32.1)
It is easily confirmed that
$$wJ = 1/\alpha' = \sec\phi',$$  (32.2)
so that, using the definition of $J$,
$$\rho JP = -\tan\phi'.$$  (32.3)
Again, $Q = 1 - 2w^3\dot{C} = -2w^3\,d(R/w)/d\xi$,
whence
$$\xi J^3Q = -2w^3J^3\xi\,d(R/w)/d\xi = -\sec^3\phi'\,\rho\,d[(K+1)\cos\phi']/d\rho,$$  (32.4)


because of (31.1) and (32.2). (32.1) now becomes
$$\bar{K} = \tan^2\phi' + K\sec^2\phi' + \sec^3\phi'\,\rho\,d[(K+1)\cos\phi']/d\rho.$$  (32.5)
This equation can be rewritten in an illuminating form if one
introduces the longitudinal spherical aberration $\delta$, though this quantity
is strictly speaking redundant here. Thus one has


$$\delta = -1 - \rho\cot\phi',$$  (32.6)
which is consistent with (31.4) and (32.3). One then obtains in
place of (32.5) the equivalent equation


$$\bar{K} = \sec^3\phi'\left(\frac{d}{d\rho}(\rho\cos\phi'\,K) - \sin^3\phi'\,\frac{d\delta}{d\rho}\right),$$  (32.7)


which constitutes the desired result; see also equation (33.5).

To calculate $\bar{K}$ one has to plot $\rho\cos\phi'K$ and $\delta$ against $\rho$ by reading
these quantities off from a number of exact axial ray traces. (These
will usually have been obtained in any event, since one is not likely
to be content with the value of $K$ for one single value of $\rho$.) The
slopes of the curves can be read off directly for any chosen value of
$\rho$, and then the value of $\bar{K}$ follows immediately.
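
The slope-reading procedure can be imitated with central differences. The sketch below is an illustration under stated assumptions: it writes the tangential quantity as $\bar{K} = \sec^3\phi'\,[\,d(\rho\cos\phi' K)/d\rho - \sin^3\phi'\,d\delta/d\rho\,]$, with $\cos\phi' = 1/(wJ)$, $\sin\phi' = -\rho P/w$ and $\delta = 1/(JP) - 1$, and compares it with the closed form $\xi J^3(P^2R - Q) + K$. The series for $S$ and $C$ (with $d' = 1$) and all function names are hypothetical stand-ins for real ray-trace data.

```python
import math


def axial_state(rho, s, c):
    """Return (K, K_bar_closed_form, cos_phi, sin_phi, delta) at aperture rho."""
    xi = rho * rho
    w = math.sqrt(1.0 + xi)
    s_dot = sum((q + 1) * sq * xi**q for q, sq in enumerate(s, start=1))
    c_val = sum(cq * xi**q for q, cq in enumerate(c, start=1))
    c_dot = sum(q * cq * xi**(q - 1) for q, cq in enumerate(c, start=1))
    P = 1.0 - 2.0 * w * s_dot
    Q = 1.0 - 2.0 * w**3 * c_dot
    R = 1.0 + w * c_val
    J = (w * w - xi * P * P) ** -0.5
    K = J * R - 1.0
    K_bar = xi * J**3 * (P * P * R - Q) + K
    return K, K_bar, 1.0 / (w * J), -rho * P / w, 1.0 / (J * P) - 1.0


def tangential_from_slopes(rho, s, c, h=1e-5):
    """K_bar from the slopes of rho*cos(phi')*K and delta, by central differences."""
    def curve(r):
        K, _, cos_phi, _, _ = axial_state(r, s, c)
        return r * cos_phi * K

    def delta(r):
        return axial_state(r, s, c)[4]

    slope_curve = (curve(rho + h) - curve(rho - h)) / (2.0 * h)
    slope_delta = (delta(rho + h) - delta(rho - h)) / (2.0 * h)
    _, _, cos_phi, sin_phi, _ = axial_state(rho, s, c)
    return (slope_curve - sin_phi**3 * slope_delta) / cos_phi**3
```

In practice the two curves would come from a handful of traced rays and an interpolation, so the accuracy is limited by the interpolation rather than by the step `h` used here.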

Having thus calculated the values of $K$ and $\bar{K}$ the total
displacement for all values of $\rho$ and $\theta$ is given by


$$\epsilon'_y = \sigma\cos\theta + \tfrac12[(\bar{K} + K) + (\bar{K} - K)\cos 2\theta]h',$$
$$\epsilon'_z = \sigma\sin\theta + \tfrac12(\bar{K} - K)\sin 2\theta\,h',$$  (32.8)


terms not depending linearly on $h'$ having of course been rejected
as before. In conclusion we note that when spherical aberration is


absent (32.7) reduces to , 
ee K - (32.9) 


33. Two theorems relating to circular coma of all orders 


This section concerns itself with two results, the first of which 
follows trivially from (32.7), but which is nevertheless of consider- 
able interest. 


(i) On isoplanatism and aplanatism

A system is called isoplanatic if circular coma is completely absent
(i.e. there is no asymmetry of the image for sufficiently small values
of $h'$); and it is called aplanatic if, in addition, spherical aberration is
completely absent. (An aplanatic system therefore forms a perfect
image of a plane object in $\mathcal{I}$ lying in a sufficiently small
neighbourhood of $O_0$.) (32.7) now shows that absence of circular coma, i.e.
$\bar{K} = K = 0$ for all values of $\rho$, requires that $\delta = 0$ for all $\rho$.
Therefore an isoplanatic system is necessarily aplanatic.

It appears that with our strict interpretation of the terms
‘aplanatism’ and ‘isoplanatism’ the latter must be regarded as redundant.
The distinction arose in the first place by a restriction to
third-order displacements alone, for under these circumstances
isoplanatism only requires that $p_2$ be zero, whereas aplanatism requires
that $p_1$ vanish as well. Likewise, proceeding to the fifth order, (25.1)
shows that isoplanatism merely entails that $s_2 = p_1 = p_2 = 0$; but
the vanishing of $s_1$ is not required. More generally, if one
consistently goes to order $2n-1$ but no further, the presence of the factor
$\sin^3\phi'$ ($= O(3)$) in the second term on the right of (32.7) brings with
it that isoplanatism imposes no condition on the coefficient of
spherical aberration of order $2n-1$. However, the rather artificial
nature of this result is evident.

An alternative possibility is to take isoplanatism to mean the
absence of the circular comatic term from the aberration function,
i.e. $C = 0$. In view of (28.4) there is then no circular comatic
pseudo-displacement, and in the primary domain the classical situation
obtains. There is, of course, a residual effective comatic
displacement. In fact, one finds that

$$\bar{K} = J^3 - 1, \qquad K = J - 1.$$  (33.1)
Provided $J \neq 1$ the comatic ratio for any value of $\rho$ is therefore
$$\bar{K}/K = J^2 + J + 1.$$  (33.2)


For sufficiently small values of $\rho$ the right-hand member of this has
the value 3, consistently with (25.3); but the present result is more
general since it remains valid even if spherical aberration is absent
to some finite order (cf. equation (25.3)).


(ii) Sine-condition and the comatic ratio of order $2n-1$


In this section, let $\chi_{2n-1}$ denote the exact circular comatic ratio
$\kappa_{t,2n-1}/\kappa_{s,2n-1}$. We have already seen, for instance in equation (25.2),
that this does not in general have the value $2n-1$ since the
coefficients of spherical aberration and circular coma of the lower
orders enter into it. The question naturally arises whether a system
can be such that $\chi_{2n-1} = 2n-1$ exactly, for all $n$; and under what
conditions this situation will obtain. To solve this problem we first
rewrite equation (32.7), introducing for this purpose the
‘sine-ratio’ $\Phi$, defined as


$$\Phi = \sin\phi/(m\sin\phi');$$  (33.3)

so that
$$\rho(K+1) = -\Phi\tan\phi'.$$  (33.4)
The first term on the right of (32.7) may be transformed as follows:

$$\sec^3\phi'\,\frac{d}{d\rho}(\rho\cos\phi'\,K) = \frac{d(\rho K)}{d\rho} - \tan^3\phi'\,\frac{d\Phi}{d\rho} + \rho(K+1)\tan\phi'\,\frac{d\tan\phi'}{d\rho} - \tan^2\phi' + \rho K\sec^3\phi'\,\frac{d\cos\phi'}{d\rho}.$$

Then (32.7) becomes
$$\bar{K} = \frac{d(\rho K)}{d\rho} - \tan^3\phi'\,\frac{d\Phi}{d\rho} + \left(\rho(K+1)\tan\phi'\,\frac{d\tan\phi'}{d\rho} - \tan^2\phi' + \rho K\sec^3\phi'\,\frac{d\cos\phi'}{d\rho} + \tan^3\phi'\,\frac{d(\rho\cot\phi')}{d\rho}\right).$$


It is readily confirmed that the expression in the large parentheses
vanishes identically, so that


$$\bar{K} = \frac{d(\rho K)}{d\rho} - \tan^3\phi'\,\frac{d\Phi}{d\rho},$$  (33.5)
and the relation
$$\Phi = (1+K)(1+\delta)$$  (33.6)


should be borne in mind.
Now the power series for $K$ has the generic form


$$K = \sum_{r=1}^{\infty} K_r\rho^{2r},$$  (33.7)


where $K_r$ relates to order $2r+1$. If $\chi_{2n-1}$ is to have the value $2n-1$
for all $n$, $\bar{K}$, written as a power series, must be


$$\bar{K} = \sum_{r=1}^{\infty}(2r+1)K_r\rho^{2r}.$$  (33.8)
When this is the case, however,
$$\bar{K} = \frac{d(\rho K)}{d\rho},$$  (33.9)


by inspection of (33.7) and (33.8). According to (33.5) it then follows
that $\Phi = \text{const.} = 1$; conversely, given $\Phi = 1$, (33.8) follows from
(33.7). On the other hand the condition $\Phi = 1$ means precisely that
the sine-condition is fulfilled. Accordingly we have proved that
the circular comatic ratio of order $2n-1$ has exactly the value
$2n-1$ for all $n$ if and only if the sine-condition is satisfied for all
values of $\rho$.

We already know from (28.2) that if $\Phi = 1$ then $2\dot{S} + C = 0$,
so that the comatic displacement is implicit in spherical aberration.
In fact, recalling (33.6) and (33.9), we see that, exactly,


$$K = -\sigma/\rho, \qquad \bar{K} = -d\sigma/d\rho.$$  (33.10)
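
The closing statement of this section lends itself to a direct numerical test. The sketch below is illustrative and rests on assumptions of its own: $d' = 1$, a spherical-aberration series $S(\xi) = \sum_q s_q\xi^{q+1}$ chosen freely, and a coma function forced to satisfy the sine-condition through $C = -2\dot{S}$, i.e. $c_q = -2(q+1)s_q$. It then checks that the sagittal quantity $K = JR - 1$ equals $-\sigma/\rho$ and that the tangential quantity $\xi J^3(P^2R - Q) + K$ equals $-d\sigma/d\rho$, the latter by a central difference. All names are hypothetical.

```python
import math


def quantities(rho, s, c):
    """Return (sigma, K, K_bar) for S = sum s_q xi**(q+1), C = sum c_q xi**q."""
    xi = rho * rho
    w = math.sqrt(1.0 + xi)
    s_dot = sum((q + 1) * sq * xi**q for q, sq in enumerate(s, start=1))
    c_val = sum(cq * xi**q for q, cq in enumerate(c, start=1))
    c_dot = sum(q * cq * xi**(q - 1) for q, cq in enumerate(c, start=1))
    P = 1.0 - 2.0 * w * s_dot
    Q = 1.0 - 2.0 * w**3 * c_dot
    R = 1.0 + w * c_val
    J = (w * w - xi * P * P) ** -0.5
    K = J * R - 1.0
    return rho * (1.0 - J * P), K, xi * J**3 * (P * P * R - Q) + K


def sine_condition_residuals(rho, s, h=1e-5):
    """Residuals K + sigma/rho and K_bar + d(sigma)/d(rho) when C = -2 S_dot."""
    c = [-2.0 * (q + 1) * sq for q, sq in enumerate(s, start=1)]
    sigma, K, K_bar = quantities(rho, s, c)
    d_sigma = (quantities(rho + h, s, c)[0] - quantities(rho - h, s, c)[0]) / (2.0 * h)
    return K + sigma / rho, K_bar + d_sigma


if __name__ == "__main__":
    print(sine_condition_residuals(0.3, [0.05, 0.01]))
```

Both residuals should be zero up to rounding and the error of the difference quotient; choosing a coma function that violates $2\dot{S} + C = 0$ makes them non-zero.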


34. Cosine-relations and cosine-conditions 


The preceding discussion of the sine-condition was presented at 
some length largely because it has been a useful tool in design prob- 
lems in the past; and partly to analyze clearly its function and 
meaning. In particular we wanted to bring out clearly its direct 
connection with total sagittal rather than tangential coma; the way 
the presence of spherical aberration is allowed for; and the fact that 
it concerns itself with circular coma alone, and so has nothing to do 
with non-linear comatic types. 

It is possible to obtain generalizations of the sine-condition which 
are intended to say something about non-linear comatic types. 
On the other hand, they seem to lack any sort of usefulness in 
practice, and we therefore do not deal with them. However, for the 
sake of completeness we shall briefly discuss the so-called
cosine-relations.

As in Section 27, consider first a pair of perfect conjugate planes,
so that equation (27.2) obtains. The right-hand member of this
depends on $\mathbf{y}_1$ only. It follows that for all rays through any point O
of the object

$$\beta - m\beta' = \text{const.}, \qquad \gamma - m\gamma' = \text{const.}, \tag{34.1}$$

and these are the cosine-relations. The requirement that (34.1)
should hold for all rays through any given point of the object plane
constitutes the cosine-conditions. They require that
$\partial V/\partial\mathbf{y}' + \partial V/\partial\mathbf{y}_1$ depend upon $\mathbf{y}_1$ only, i.e.

$$\mathbf{y}'(2V_\xi + V_\eta) + \mathbf{y}_1(V_\eta + 2V_\zeta) = \text{function of } \mathbf{y}_1, \tag{34.2}$$

for arbitrary values of $\mathbf{y}'$. (34.2) splits up into the two separate
conditions

$$2V_\xi + V_\eta = 0, \qquad V_\eta + 2V_\zeta = \text{function of } \zeta, \tag{34.3}$$

and these are satisfied if, and only if, $V$ is the sum of a function of $\zeta$
alone and a function of $u$ $(= \xi - 2\eta + \zeta)$ alone. Accordingly

$$V = g(\zeta) - (1+u)^{1/2} + v(u). \tag{34.4}$$

Evidently satisfaction of the cosine-conditions does not imply
absence of aberrations. Given (34.4), the exact displacement is

$$\boldsymbol{\epsilon}' = (\mathbf{y}' - \mathbf{y}_1)\left[1 - (1-g)(1 + 2ug - ug^2)^{-1/2}\right], \tag{34.5}$$

where $g = 2(1+u)^{1/2}\,dv/du$. If $\sigma(\rho)$ is the usual spherical aberration
$\epsilon'_y(\rho, h' = 0, \theta = 0)$, (34.5) may be written

$$\boldsymbol{\epsilon}' = (\mathbf{y}' - \mathbf{y}_1)\,u^{-1/2}\,\sigma(u^{1/2}), \tag{34.6}$$

though we might equally well have introduced distortion, for
instance. At any rate, in the absence of spherical aberration
satisfaction of the cosine-conditions ensures perfect imagery for the
conjugate planes in question.
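The closed form (34.5) can be checked numerically. The sketch below is only an illustration under stated assumptions: it works in a meridional section (one transverse coordinate), takes the hypothetical aberration function v(u) = a u^2, uses the normalization d' = 1, and obtains the direction cosine beta' by a central difference of V. The function names are invented for this example.

```python
import math

def V(yp, y1, a):
    """Characteristic (34.4) without g(zeta), which does not depend on y'.

    Assumption for illustration: v(u) = a * u**2.
    """
    u = (yp - y1) ** 2
    return -math.sqrt(1 + u) + a * u * u

def displacement_from_V(yp, y1, a, h=1e-6):
    """epsilon' = y' - y1 + beta'/alpha', with beta' = dV/dy' (central difference)."""
    beta = (V(yp + h, y1, a) - V(yp - h, y1, a)) / (2 * h)
    alpha = math.sqrt(1 - beta * beta)  # meridional ray: gamma' = 0
    return yp - y1 + beta / alpha

def displacement_34_5(yp, y1, a):
    """Closed form (34.5), with g = 2 (1+u)^(1/2) dv/du and dv/du = 2 a u."""
    u = (yp - y1) ** 2
    g = 2 * math.sqrt(1 + u) * (2 * a * u)
    return (yp - y1) * (1 - (1 - g) / math.sqrt(1 + 2 * u * g - u * g * g))

if __name__ == "__main__":
    print(displacement_from_V(0.3, 0.1, 0.05), displacement_34_5(0.3, 0.1, 0.05))
```

Setting a = 0 in either function returns a vanishing displacement, which is the statement that the cosine-conditions give perfect imagery once spherical aberration is absent.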

The last result is a generalization of the condition $2S + C = 0$
which we met in Section 28. It is instructive to recover it here. We
have

$$v(u) = v(\xi) - 2v_\xi(\xi)\,\eta + \ldots,$$

i.e. $S(\xi) = v_\xi(\xi)$, $C(\xi) = -2v_\xi(\xi)$, and so $2S + C = 0$, as was to be
shown.
B. The spherical point characteristic $V^\dagger$, wavefronts
and aberration functions


35. The characteristic function $V(x', y', z', y, z)$


Hitherto, from Section 6 onwards, the explicit appearance of the
variable $x'$ in the point characteristic $V$ was prevented by the
convention of choosing a fixed posterior base-plane, and so in effect
simply setting $x' = 0$. On various occasions it is, however, important
to know how $V$ in fact depends upon $x'$. For example, given this
dependence, that of the aberration coefficients (on $x'$) is also
known. By this is meant the following. If the origin of coordinates
in the image space is taken at the previous base-point B’, we may
contemplate a new base-point $\hat{B}'$ which is at a distance $\hat{x}'$ from B’.
If the coordinates of points in the new base-plane are denoted by
$\hat{\mathbf{y}}'$, and the corresponding rotational invariants by $\hat\xi, \hat\eta, \hat\zeta$ (where
$\hat\zeta = \zeta$, of course) then

$$\hat{V} = g(\zeta) - \left[(1-\hat{x}')^2 + \hat{u}\right]^{1/2} + \hat{v}(\hat{x}'; \hat\xi, \hat\eta, \zeta). \tag{35.1}$$

$\hat{V}$ is physically the optical distance between points in the object
plane and the new base-plane. The function $g$ has been taken as
fixed by prescription. This is permissible since the paraxial term of
$g$ is in any event independent of the position of $\hat{B}'$, whilst the part
of $\hat{v}$ which depends upon $\zeta$ alone, i.e. $\hat{v}(\hat{x}'; 0, 0, \zeta)$, will take care
of all other terms. Except for the ubiquitous provision of
circumflexes, the representation of $\hat{v}$ as a power series is formally the
same as before. The aberration coefficients are thus $\hat{v}^{(n)}_{\mu\nu}(\hat{x}')$, their
dependence upon $\hat{x}'$ now being indicated explicitly. In particular,

$$v^{(n)}_{\mu\nu} = \hat{v}^{(n)}_{\mu\nu}(0). \tag{35.2}$$


One way of proceeding from here would be to insert (35.1) in
the first of the differential equations (3.6). The result is a sequence
of comparatively simple ordinary differential equations for the
$\hat{v}^{(n)}_{\mu\nu}$, obtained by equating the factor multiplying
$\hat\xi^{\,n-\mu}\hat\eta^{\,\mu-\nu}\zeta^{\nu}$ to zero.
These equations must then be solved, subject to the initial
conditions (35.2). It turns out, however, that this method is
unnecessarily cumbersome, and we abandon it.


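A more advantageous procedure is the following. Independently
of the variables on which the functions $V$ and $\hat{V}$ depend, we may
write

$$\hat{V} = V + \hat{x}'/\alpha'. \tag{35.3}$$

On using (35.1) this gives

$$\hat{v} = v + \left[(1-\hat{x}')^2 + \hat{u}\right]^{1/2} - (1+u)^{1/2} + \hat{x}'/\alpha'. \tag{35.4}$$

Though we are ultimately aiming at $\hat{v}$, regarded as a function of
$\hat\xi, \hat\eta, \zeta$, we evaluate the right-hand member in the first place as a
function of $\xi, \eta, \zeta$. We temporarily introduce the abbreviations

$$A = 2(\mathbf{y}_1 - \mathbf{y}')\cdot\boldsymbol{\epsilon}', \qquad B = \boldsymbol{\epsilon}'\cdot\boldsymbol{\epsilon}', \tag{35.5}$$

in terms of the notation suggested just after equation (14.4). Note
that $A$ and $B$ are $O(4)$ and $O(6)$ respectively. In the first place we
now have

$$1/\alpha' = (1 + u + A + B)^{1/2}, \tag{35.6}$$

since

$$\boldsymbol{\beta}'/\alpha' = \mathbf{y}_1 - \mathbf{y}' + \boldsymbol{\epsilon}', \tag{35.7}$$

which is just (17.1). Again

$$\hat{\mathbf{y}}' - \mathbf{y}_1 = (1-\hat{x}')(\mathbf{y}' - \mathbf{y}_1) + \hat{x}'\boldsymbol{\epsilon}', \tag{35.8}$$

so that

$$\hat{u} = (1-\hat{x}')^2\,(u + qA + q^2B), \tag{35.9}$$

where

$$q = -\hat{x}'/(1-\hat{x}'). \tag{35.10}$$

(35.4) now becomes

$$\hat{v} = v + (1+u)^{1/2}\Big\{(1-\hat{x}')\big[1 + (1+u)^{-1}(qA + q^2B)\big]^{1/2}
+ \hat{x}'\big[1 + (1+u)^{-1}(A + B)\big]^{1/2} - 1\Big\}.$$

Every square root of the generic form $(1+X)^{1/2}$ is now to be written
as $1 + \tfrac12 X - \tfrac18 X^2 + \ldots$; and it is after this step that the various orders
are to be sorted out. The constant terms of $\hat{v} - v$ cancel out, as do
the terms linear in $A$. One thus gets

$$\hat{v} = v - \tfrac12 q\,(\boldsymbol{\epsilon}'_3\cdot\boldsymbol{\epsilon}'_3) + O(8), \tag{35.11}$$

which implies reasonable simplicity for orders three and five; and
we confine our attention to these.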
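The cancellation that leads to (35.11) can be checked numerically. The sketch below is an illustration only: it treats u, A, B and the shift of the base-plane as plain numbers (the names `xhat`, `vhat_minus_v` and `leading_term` are invented here), evaluates the exact difference between the shifted and unshifted aberration functions from (35.4), (35.6) and (35.9), and compares it with the leading term, for which the factor (1+u)^(-1/2) is kept rather than replaced by 1.

```python
import math

def vhat_minus_v(u, A, B, xhat):
    """Exact value of v_hat - v from (35.4), using (35.6), (35.9) and (35.10)."""
    q = -xhat / (1 - xhat)                                  # (35.10)
    u_hat = (1 - xhat) ** 2 * (u + q * A + q * q * B)       # (35.9)
    inv_alpha = math.sqrt(1 + u + A + B)                    # (35.6)
    return (math.sqrt((1 - xhat) ** 2 + u_hat)
            - math.sqrt(1 + u)
            + xhat * inv_alpha)

def leading_term(u, B, xhat):
    """Term linear in B; it reduces to -q B / 2 when u is neglected."""
    q = -xhat / (1 - xhat)
    return -0.5 * q * B / math.sqrt(1 + u)

if __name__ == "__main__":
    # With B = 0 the result is of second order in A: no constant or linear term.
    print(vhat_minus_v(0.04, 1e-3, 0.0, 0.2))
    # With A = 0 the result is dominated by the term linear in B.
    print(vhat_minus_v(0.04, 0.0, 1e-4, 0.2), leading_term(0.04, 1e-4, 0.2))
```

Since A is of degree four and B of degree six in the coordinates, the surviving term linear in B is the only contribution below degree eight, which is why orders three and five remain simple.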

It remains to express $\xi, \eta, \zeta$ in terms of $\hat\xi, \hat\eta, \zeta$. At this stage we
recognize the virtues of having allowed $\boldsymbol{\epsilon}'$ to appear on the right of
(35.11). At first sight its presence might seem to be a great
complication here, bearing in mind that $\boldsymbol{\epsilon}'$ is related to the derivatives of $v$ in
a complicated way. However, the situation is in fact quite simple.
In the first place, the displacement is a quantity independent of the
manner in which it is expressed; so that $\boldsymbol{\epsilon}'$ may simply be replaced
by $\hat{\boldsymbol{\epsilon}}'$. This means, explicitly, that $\boldsymbol{\epsilon}' = 4d'p_1\mathbf{y}'\xi + \ldots$ (cf. (19.2)) is
to be replaced by $\hat{\boldsymbol{\epsilon}}' = 4\hat{d}'\hat{p}_1\hat{\mathbf{y}}'\hat\xi + \ldots$ . In other words, since here
$d' = 1$ and $\hat{d}' = 1 - \hat{x}'$, $\hat{\boldsymbol{\epsilon}}'$ is formally obtained from $\boldsymbol{\epsilon}'$ by supplying
all its variables and coefficients with a circumflex, and providing
one common additional factor $1 - \hat{x}'$. How $\hat{\boldsymbol{\epsilon}}'$ depends on the $\hat{v}^{(n)}_{\mu\nu}$
is already known from the work of Section 23. On the other hand,
the terms of lowest degree, i.e. those of degree 4, appear on the
right of (35.11) only in its first term; so that this alone needs to be
retained to determine the $\hat{v}^{(2)}_{\mu\nu}$. Going to terms of degree six, the
coefficients which occur in the second term on the right of (35.11)
are already known. Evidently one has a comparatively simple
iterative process for getting the $\hat{v}^{(2)}_{\mu\nu}, \hat{v}^{(3)}_{\mu\nu}, \ldots$ in turn, each step making the
greatest possible use of the work done at preceding steps.
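From equation (35.8) we easily find that

$$\begin{aligned}
\xi &= c^2\hat\xi + 2cq\hat\eta + q^2\zeta + 2q(c\hat{\mathbf{y}}' + q\mathbf{y}_1)\cdot\hat{\boldsymbol{\epsilon}}' + q^2\,\hat{\boldsymbol{\epsilon}}'\cdot\hat{\boldsymbol{\epsilon}}',\\
\eta &= c\hat\eta + q\zeta + q\,\hat{\boldsymbol{\epsilon}}'\cdot\mathbf{y}_1,\\
\zeta &= \hat\zeta,
\end{aligned} \tag{35.12}$$

where $c = (1-\hat{x}')^{-1} = 1 - q$.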
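As far as the primary coefficients are concerned, the terms
involving $\hat{\boldsymbol{\epsilon}}'$ in (35.11) and (35.12) are to be ignored, and one is left
with a simple linear substitution. This at once leads to the relations

$$\begin{aligned}
\hat{p}_1 &= c^4 p_1,\\
\hat{p}_2 &= c^3(p_2 + 4qp_1),\\
\hat{p}_3 &= c^2(p_3 + qp_2 + 2q^2p_1),\\
\hat{p}_4 &= c^2(p_4 + 2qp_2 + 4q^2p_1),\\
\hat{p}_5 &= c\,(p_5 + 2qp_4 + 2qp_3 + 3q^2p_2 + 4q^3p_1),\\
\hat{p}_6 &= p_6 + qp_5 + q^2p_4 + q^2p_3 + q^3p_2 + q^4p_1.
\end{aligned} \tag{35.13}$$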
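The linear substitution behind (35.13) is easy to reproduce mechanically. The following sketch is an illustration under an explicit assumption, namely that the primary part of the aberration function is ordered as p1 xi^2 + p2 xi eta + p3 xi zeta + p4 eta^2 + p5 eta zeta + p6 zeta^2; the helper names are invented for this example. It substitutes xi = c^2 xi_hat + 2cq eta_hat + q^2 zeta and eta = c eta_hat + q zeta (the displacement terms of (35.12) being dropped), collects the coefficients exactly with rational arithmetic, and compares them with the closed forms.

```python
from fractions import Fraction

def mul(a, b):
    """Product of two polynomials stored as {(i, j, k): coeff} for xi^i eta^j zeta^k."""
    out = {}
    for (i1, j1, k1), x in a.items():
        for (i2, j2, k2), y in b.items():
            key = (i1 + i2, j1 + j2, k1 + k2)
            out[key] = out.get(key, 0) + x * y
    return out

def shifted_primary(p, q):
    """Coefficients (p1_hat, ..., p6_hat) obtained by direct substitution."""
    c = 1 - q
    xi = {(1, 0, 0): c * c, (0, 1, 0): 2 * c * q, (0, 0, 1): q * q}
    eta = {(0, 1, 0): c, (0, 0, 1): q}
    zeta = {(0, 0, 1): Fraction(1)}
    # Assumed monomial order: xi^2, xi eta, xi zeta, eta^2, eta zeta, zeta^2.
    pairs = [(xi, xi), (xi, eta), (xi, zeta), (eta, eta), (eta, zeta), (zeta, zeta)]
    total = {}
    for coeff, (a, b) in zip(p, pairs):
        for key, val in mul(a, b).items():
            total[key] = total.get(key, 0) + coeff * val
    order = [(2, 0, 0), (1, 1, 0), (1, 0, 1), (0, 2, 0), (0, 1, 1), (0, 0, 2)]
    return tuple(total.get(k, 0) for k in order)

def formula_35_13(p, q):
    """Closed forms of (35.13), including the first relation p1_hat = c^4 p1."""
    p1, p2, p3, p4, p5, p6 = p
    c = 1 - q
    return (c**4 * p1,
            c**3 * (p2 + 4*q*p1),
            c**2 * (p3 + q*p2 + 2*q**2*p1),
            c**2 * (p4 + 2*q*p2 + 4*q**2*p1),
            c * (p5 + 2*q*p4 + 2*q*p3 + 3*q**2*p2 + 4*q**3*p1),
            p6 + q*p5 + q**2*p4 + q**2*p3 + q**3*p2 + q**4*p1)

if __name__ == "__main__":
    p = tuple(Fraction(k) for k in (1, -2, 3, 5, -7, 11))  # arbitrary test values
    q = Fraction(1, 3)
    print(shifted_primary(p, q) == formula_35_13(p, q))
```

Exact fractions are used so that the comparison is an identity check rather than a floating-point approximation.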


Proceeding to the fifth-order coefficients, the terms of degree 4
in (35.12) must be retained. The equations corresponding to
(35.13) become very lengthy and we therefore give the results for
$\hat{s}_1$ and $\hat{s}_2$ only. Thus

$$\begin{aligned}
\hat{s}_1 &= c^6(s_1 + 8qp_1^2),\\
\hat{s}_2 &= c^5\left[(s_2 + 6qs_1) + 48q^2p_1^2 + 12qp_1p_2\right].
\end{aligned} \tag{35.14}$$


We shall have occasion to contemplate these equations later in a
different context. At any rate, we have now the means systematically
to determine the functions $\hat{v}^{(n)}_{\mu\nu}(\hat{x}')$ to whatever order required.

Let this have been done. Then we change the notation as follows:
we omit the circumflex everywhere, for it is now redundant.
However, some distinguishing mark has to be added to the constants
previously denoted by $v^{(n)}_{\mu\nu}$ and we write them here as $\mathring{v}^{(n)}_{\mu\nu}$; that is to
say, the $\mathring{v}^{(n)}_{\mu\nu}$ are the aberration coefficients relating to the usual
base-plane $x' = 0$; and one should therefore now write $\mathring{v}$ in place of $v$.
In short, what we have before us is exactly the point characteristic
$V(x', y', z', y, z)$, with its dependence on $x'$ explicitly restored. Thus

$$V(x', y', z', y, z) = g(\zeta) - \left[(1-x')^2 + \xi - 2\eta + \zeta\right]^{1/2}
+ \sum_{n=2}^{\infty}\sum_{\mu=0}^{n}\sum_{\nu=0}^{\mu} v^{(n)}_{\mu\nu}(x')\,\xi^{\,n-\mu}\eta^{\,\mu-\nu}\zeta^{\nu}. \tag{35.15}$$


36. Arbitrary posterior base-surfaces 


The conventional point characteristic $V(y', z', y, z)$ is recovered
from (35.15) by simply setting $x' = 0$. There is, however, no reason
why we should be restricted to taking some normal plane to serve as
posterior reference surface, if, indeed, we want to choose such a
surface at all. On the contrary, we are not prevented from setting

$$x' = \chi(y', z', y_1, z_1) \tag{36.1}$$

in (35.15), where $\chi$ can be chosen in any convenient way at all,
subject, of course, to the usual condition of non-conjugacy. Any
ray is then specified by the values of $\mathbf{y}_1$ and of the coordinates $\mathbf{y}'$
of its point of intersection D’ with the surface $\mathcal{B}^{*\prime}$ whose equation
is $x' = \chi(y', z', y_1, z_1)$. Then, using (36.1) and taking $x = 0$ as usual,
(4.1) becomes

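$$dV_\chi = \alpha'\left(\frac{\partial\chi}{\partial y'}dy' + \frac{\partial\chi}{\partial z'}dz' + \frac{\partial\chi}{\partial y_1}dy_1 + \frac{\partial\chi}{\partial z_1}dz_1\right)
+ \beta'dy' + \gamma'dz' - \beta\,dy - \gamma\,dz,$$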
the subscript on the left serving as a reminder that the base-surface
is now $\mathcal{B}^{*\prime}$. In place of (6.1) we have here

ey oy 


; _ _—M& , ., OX 
) Bg : (36.2) 


dy 
Further, the aberration function $v_\chi$ is defined in the usual way by
the equation

$$V_\chi = g(\zeta) - \left\{[1 - \chi(y', z', y_1, z_1)]^2 + \xi - 2\eta + \zeta\right\}^{1/2} + v_\chi. \tag{36.3}$$

It should be noted that $\chi$ need not be expressible as a power series
in $\xi, \eta, \zeta$, though, when it is not, much of the formal simplicity of
the theory of symmetric systems will be lost. This would be the
case, for instance, if $\mathcal{B}^{*\prime}$ were taken to be a fixed plane not normal to
the axis of K (cf. the discussion of Section 8, relating to the idea of
symmetry). Simplicity can, however, be restored by taking such
a plane to be ‘movable’; i.e. one might require it to pass through E’,
with its normal pointing in a direction parallel to the line E’I’,
for then one has simply $\chi = -\eta$. At any rate, we see the reason for
letting $\chi$ depend also on $\mathbf{y}_1$ in (36.1).

In (36.3) we have followed the usual prescription of defining the
aberration function as the difference between the actual and the
ideal characteristic functions. One might think of the ideal
characteristic as ‘modified’ by the addition of the aberration function.
It is obviously not mandatory that such a modification should be
of just this kind; and in place of (36.3) one could, for instance, write

$$V_\chi = g(\zeta) - \left\{[1 - \chi(y', z', y_1, z_1)]^2 + \xi - 2\eta + \zeta - 2\bar{v}_\chi\right\}^{1/2}. \tag{36.4}$$

This, then, is one possible definition of a modified aberration
function $\bar{v}_\chi$; and we shall meet a special case of it at the end of the next
section.


37. The spherical point characteristic $V^\dagger$


Inspection of (36.3) at once suggests an advantageous choice of
base-surface, namely that which, in effect, removes the awkward
square root altogether. This obviously requires that we take $\mathcal{B}^{*\prime}$
to be a spherical surface with centre at I’; so that it is movable in
the sense of the discussion at the end of Section 36. When it is
normalized by the requirement that it pass through E’, we denote
it by $W_0$, and the corresponding characteristic function $V_\chi$ by $V^\dagger$.
It is natural to call $V^\dagger$ the spherical point characteristic. (36.1)
now reads specifically

$$\chi^\dagger = 1 - (1 - \xi + 2\eta)^{1/2}. \tag{37.1}$$

On inserting this in (36.3) there comes

$$V^\dagger = g^\dagger(\zeta) + v^\dagger, \tag{37.2}$$

where $g^\dagger(\zeta) = g(\zeta) - (1+\zeta)^{1/2}$. The relations (36.2) become

$$\boldsymbol{\beta}' = \partial v^\dagger/\partial\mathbf{y}' - \alpha'(1 - \xi + 2\eta)^{-1/2}(\mathbf{y}' - \mathbf{y}_1). \tag{37.3}$$

As usual $\boldsymbol{\epsilon}' = \mathbf{y}' - \mathbf{y}_1 + (1-x')\boldsymbol{\beta}'/\alpha'$, and in view of (37.1) and
(37.3) we get the result

$$\boldsymbol{\epsilon}' = X'\,\partial v^\dagger/\partial\mathbf{y}', \tag{37.4}$$

where

$$X' = (1-x')/\alpha'. \tag{37.5}$$


The simplicity of (37.4) is striking. In the first place, the exact
displacement is just the derivative of $v^\dagger$, all terms of the power series
for which are merely to be multiplied by the common factor $X'$.
Moreover, geometrically $X'$ is just the distance from D’ to O’ (see
also Fig. 5.1), which must be nearly constant in practice if an
image of reasonable quality is to be formed in $\mathcal{I}'$. Usually it will
therefore be entirely adequate when calculating the total
displacement to replace $X'$ by the constant factor $R'$, which is the distance
from E’ to I’. (Of course $R'$ is a different constant for different values
of $\zeta$.)
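The step from (37.3) to (37.4) can be written out in a few lines. The following is a sketch which assumes that the base-surface is the sphere (37.1), so that $1 - x' = (1-\xi+2\eta)^{1/2}$ at the point D’ where the ray meets it:

```latex
\begin{align*}
\boldsymbol{\epsilon}'
  &= \mathbf{y}' - \mathbf{y}_1 + (1-x')\,\boldsymbol{\beta}'/\alpha' \\
  &= \mathbf{y}' - \mathbf{y}_1
     + \frac{1-x'}{\alpha'}\,\frac{\partial v^\dagger}{\partial\mathbf{y}'}
     - (1-x')\,(1-\xi+2\eta)^{-1/2}\,(\mathbf{y}' - \mathbf{y}_1) \\
  &= \frac{1-x'}{\alpha'}\,\frac{\partial v^\dagger}{\partial\mathbf{y}'}
   \;=\; X'\,\frac{\partial v^\dagger}{\partial\mathbf{y}'} .
\end{align*}
```

The terms in $\mathbf{y}' - \mathbf{y}_1$ cancel exactly because of the spherical shape of the base-surface; for a plane base-surface no such cancellation occurs.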

Even when the separate orders which go to make up $\boldsymbol{\epsilon}'$ are
contemplated in the usual way, the situation is rather simple. Thus we
have

$$X' = \left[(1-x')^2 + (Y'-y')^2 + (Z'-z')^2\right]^{1/2} = (1 + \zeta + A + B)^{1/2}, \tag{37.6}$$

in the notation of equations (35.5). It follows that

$$X' = (1+\zeta)^{1/2} + \tfrac12 A + O(6). \tag{37.7}$$


The equations for the effective coefficients corresponding to (23.2)
and (23.3) are obtained very easily here. The power series for $v^\dagger$ is
written in the usual way, all coefficients having a dagger as
additional superscript. The relations (23.2) can be taken over as they
stand, provided one writes $p_i^\dagger$ in place of $p_i$ $(i = 1, \ldots, 5)$, at the
same time replacing the asterisk by a ‘double-dagger’ $\ddagger$. This
notation is required by the fact that the present effective coefficients
relate to a series which is generically of the same form as (23.1),
but in which the coordinates $\mathbf{y}'$ are not standard, i.e. do not relate to
&’. Hence one cannot write u\}), a"), or px, Pa» Sz» --» for the effective 
coefficients here also; and strictly speaking y’, &, 7, ¢ also should 
all be supplied with a dagger. 

As regards the secondary relations, the primary coefficients 
which enter into them are induced solely by the simple term 
4Ce3, so that, virtually by inspection, 


Sj = 6si, 5 a s3, 
53 = 453, 53 = 2Sds 

t <t __ t 
s3= 4s§+2pi, 53 = s$+4p2, (37.8) 
si = 2s} sf = 251 37. 

4 = 254, 4 = 357, 

t <t t 

S§ = 25§ +3, 53 = 253+ Pi, 
se = 2s§+pi, 5 = 55 +49}. 


Again, the contributions by the third- and fifth-order coefficients 
to the effective coefficients of order seven arise from the compara- 
tively very simple expression 
[(yvi—y’)-€3 — 807] €3 + 3865, 

bearing in mind that év®'/dy’ = e; —4¢5.
Here one writes everything first in terms of the p3,..., and only 
reverts to the p}, ..., in the end. In this way one finds 

tt = 8tj — 16pj?, 

ts = 613 + 16p]? —20pip3, 

tz = 6th+ 381+ 4Pip2—16pip3, 

tg = 4t4-+ 1Opi p2 — 8pi ps — Op", 

ts = 405 + 283+ 16p]p3+ 8pi pi — 4pips + 2p2"— 10p2 3, 

te = 4tg + 253+ 4P1 ps + 2p2b3 — 4P3" — EPL 

ty = 2t) + 4p? — 4p3Pi, 

tg = 2t3+54+8)3p3 + 4P3P4 — 2025 — 43 Pa, 
i = 2t}+si+2pipi + 4p? + 4p3 pj — api p! — $3},
tig = 2t]o +55 + 2633 — 43, 


ti = t3 —4pi ps, 

ts = 201+ 4pi ps — 8p] pi — 302°, 

ts = tf +452 —4pips + pi? — 2p3p3, 

tg = 3t} + 8pi pj + 2p3" — 8p3 13, 

t5 = 2t3+Si+ 4pips + 2php3 + 4P3 Pi— 4p3p1 — 4P3P1, 

tg = t3 +483 + 2p3p3 — 2p3 P35 — bP2, 

ty = 4tl + 4p3pi — 4p1", 

tg = 3lia + B97 + 2p2p3 + 4P3pi + 4p1" — 4919}, 

tg = 2tis+53+ 2p3p3 + 4pips —ps°—4Pi, 

tio = tg +359 + D3" —$B3- (37-9) 


The relations (37.8) and (37.9) correspond exactly to (23.3) and 
(23.4), but are evidently much simpler than these. It may also 
be noted, with regard to the identities referred to at the end of 
Section 23, that equation (23.8) is valid here even in the secondary 
domain, on account of (37.7). The tertiary identities also are still 
quite simple, e.g. t3 —6¢7 = 16)? + 4), po. 

The examination and classification of aberrations presented
earlier could of course equally well have been undertaken on the
basis of $v^\dagger$ rather than $v$, without any effect whatever on the substance
of our discussion. On the other hand the use of $V^\dagger$ rather than $V$
commends itself for a variety of reasons, not the least of which is that
the pseudo-displacement, regarded as a function of coordinates in
$W_0$, is likely to be a much closer approximation to the true
displacement than when it is considered as a function of coordinates in $\mathcal{B}'$.
This point is vividly brought out by contemplating the modified
aberration function $\bar{v}^\dagger$ of equation (36.4) in place of $v^\dagger$. One confirms
easily that

$$\bar{v}^\dagger = (1+\zeta)^{1/2}\,v^\dagger - \tfrac12 v^{\dagger 2}, \tag{37.10}$$

and then (37.4) becomes

$$\boldsymbol{\epsilon}' = \partial\bar{v}^\dagger/\partial\mathbf{y}' + O(7). \tag{37.11}$$

If the characteristic coefficients are now understood to refer to $\bar{v}^\dagger$,
there will be no primary coefficients in the equations (37.8), whilst
the secondary coefficients and the terms linear in the primary
coefficients will be absent from (37.9). In particular, up to the fifth
order the true and pseudo-displacements are exactly equal. Other
reasons for the preferential standing of $V^\dagger$ will appear later.


38. Equations of the wavefront, and the interpretation 
of the aberration functions 


Recall from Section 1 that the rays from a given object point
constitute a normal congruence throughout the optical system; that is
to say, they are orthogonal to a set of surfaces, here called
wavefronts. We focus our attention on the particular wavefront $W$
which passes through E’. Evidently, when we have a pair of perfect
conjugate planes, $W$ will coincide with the surface $W_0$ of Section 37;
and $W_0$ may therefore appropriately be referred to as the ideal
wavefront. (The ‘ideality of the wavefront’, regarded as a function
of $\mathbf{y}_1$, has the same inherent degree of arbitrariness as that in the
aberration function, as discussed at the end of Section 7.) The
non-vanishing of the aberration functions goes hand in hand with
deviations of $W$ from $W_0$, and it is therefore of interest formally to
examine the relationship between this deviation and the aberration
functions. A number of alternatives arise, and we consider them in
turn. Considerable care has to be exercised here, on account of the
fact that at various times one is simultaneously contemplating the
specification of rays by coordinates referring to either $\mathcal{B}'$, or $W_0$,
or $W$, and this is a situation which can very easily give rise to
endless confusion, since one is inclined to write $\mathbf{y}'$ every time for the
coordinates in question. It would probably be better to write 
y’, 9’, and y”, respectively, but we shall use these symbols only for 
occasional reference. 


(i) The equation $V(x', y', z', y, z)$ = constant

Referring to equation (35.15), we see immediately that any
wavefront has the equation $V(x', y', z', y, z)$ = const. The equation of $W$
is therefore

$$V(x', y', z', y, z) = V(0, 0, 0, y, z). \tag{38.1}$$

This may be solved for $x'$ as a function of $\xi, \eta, \zeta$ (see the end of
this section). We imagine the function $v(0; 0, 0, \zeta)$ to have been
absorbed in $g(\zeta)$; and then (38.1) becomes

$$\left[(1-x')^2 + u\right]^{1/2} = (1+\zeta)^{1/2} + v(x'; \xi, \eta, \zeta). \tag{38.2}$$

Since the coordinates of Q’ are $x', y', z'$ (see Fig. 5.1 which, for the
sake of clarity, has been drawn for $z' = z_1 = 0$) the left-hand
member of (38.2) is just the distance Q’I’. The first term on the right is
E’I’ = $G_0'$I’, whence the geometrical interpretation of the function
$v$ in (38.2) is given by

$$v(x'; \xi, \eta, \zeta) = Q'G_0', \tag{38.3}$$

the distance on the right being, of course, reduced, as always.
It should be noted that in (38.3) $x'$ is itself to be written in terms of
$\xi, \eta, \zeta$, i.e. by using the solution of (38.2) for $x'$.
(ii) The equation $V(\xi, \eta, \zeta) + s$ = constant


Reverting to the ordinary point characteristic $V$ the optical
distance from O to any point on a ray $\mathcal{R}$ is $V + s$, where $s$ is measured
along $\mathcal{R}$ from its point of intersection D’ with $\mathcal{B}'$. On the wavefront
$V + s$ is therefore constant, so that the equation of $W$ is

$$V(\xi, \eta, \zeta) + s = V(0, 0, \zeta). \tag{38.4}$$

Again absorbing $\mathring{v}(0, 0, \zeta)$ in $g(\zeta)$, (38.4) becomes

$$s = (1+u)^{1/2} - (1+\zeta)^{1/2} - \mathring{v}(\xi, \eta, \zeta). \tag{38.5}$$

This equation for $W$ has a most inconvenient form from a practical
point of view, the variables here referring to $\mathcal{B}'$. However, this need
not concern us, since we are mainly interested now in obtaining
a result corresponding to (38.3). To this end, note that $s = -Q'D'$
by definition, whilst the first and second terms on the right of (38.5)
are D’I’ and $G_0'$I’ respectively. Hence

$$\mathring{v} = Q'D' + D'I' - G_0'I'. \tag{38.6}$$

This curious result reflects all the complexities of $\mathring{v}$ as revealed in the
cumbersome nature of (23.3) and (23.5). We notice in passing that
$g(\zeta)$ is the sum of E’I’ and the (optical) distance from O to E’.
(iii) The equation $V^\dagger(\xi, \eta, \zeta) + s$ = constant


We proceed exactly as in the case of $V$, and the equation of $W$ is

$$s = -v^\dagger(\xi, \eta, \zeta). \tag{38.7}$$

Bearing in mind the definition of $V^\dagger$, we see at once that

$$v^\dagger = Q'Q_0'. \tag{38.8}$$

The simplicity of this result as compared with (38.6) is striking.
One must not forget, of course, that in (38.3) v appears as a function 
of 7’, d in (38.6) as a function of y’, whilst v* in (38.8) is a 
function of y*’. Of course, if desired, we may eventually express 
any one of them in terms of whatever independent variables we 
please. 


(iv) Deformation and retardation of the wavefront 


Without referring to any characteristic function in the first place,
we can always imagine the equation of the wavefront to be written
in the form

$$(1-x')^2 + \xi - 2\eta = 1 + 2D(\xi, \eta, \zeta). \tag{38.9}$$

The function $D$ thus defined will be called the deformation of the
wavefront. It should be noted that $x'$ does not occur in it explicitly.
Evidently y’ really stands for ¥’ here. The direction ratios of the 
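wave-normals, i.e. of the rays, are $-(1-x')$, $\mathbf{y}' - \mathbf{y}_1 - \partial D/\partial\mathbf{y}'$, so
that the equations of the ray are

$$\bar{x}' = x' - (1-x')t, \qquad \bar{\mathbf{y}}' = \mathbf{y}' + (\mathbf{y}' - \mathbf{y}_1 - \partial D/\partial\mathbf{y}')t, \tag{38.10}$$

where $t$ is a parameter. Setting $t = -1$, it follows directly that

$$\boldsymbol{\epsilon}' = \partial D/\partial\mathbf{y}'; \tag{38.11}$$

and this equation is exact. In view of (37.11) $D$ must evidently
resemble $\bar{v}^\dagger$ very closely (cf. equation (39.10)); but we must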
not forget that these functions depend, in the first place, upon 
y’ and y” respectively. As a matter of fact, D may be looked upon 
as a modified aberration function in the following sense. In (36.1)
choose

$$\chi = 1 - (1 - \xi + 2\eta + 2D)^{1/2}, \tag{38.12}$$

i.e. $\mathcal{B}^{*\prime}$ is taken to be $W$ itself, in view of (38.9). Then (36.4)
becomes

$$V_\chi = g(\zeta) - (1 + \zeta + 2D - 2\bar{v}_\chi)^{1/2}, \tag{38.13}$$

where we must clearly understand that the actual wavefront $W$ is
being maintained as base-surface. On the other hand $V_\chi$ is of course
a function of $\zeta$ only, so that we must have $D = \bar{v}_\chi$; and the remark
above is thus justified. Geometrically

$$2D = (Q'I')^2 - (G_0'I')^2. \tag{38.14}$$


Of the various small displacements we have encountered we
take $Q'Q_0'$ to be the most important, partly because of the prominent
position it occupies in diffraction theory. We call it the retardation
of the wavefront, and denote it by the symbol $\Delta$.

Equations (38.2) and (38.9) are of course intimately related to
one another. In fact, (38.9) constitutes just the solution of (38.2)
for the ‘unknown’ $x'$. If in the latter one therefore replaces $x'$
everywhere by the right-hand member of (38.12), one has an
identity from which the relations between the $v^{(n)}_{\mu\nu}$ and the coefficients of
the power series of $D$ may be read off.
39. The relationship between $\Delta$ and the aberration
functions

It is of some importance to have information at hand about the
differences which exist between the retardation on the one hand, and
$D$ and the various aberration functions on the other. More exactly,
we wish to obtain expressions for the dominant terms of these
differences, so that we shall know up to what order they may in fact
be strictly ignored.

With regard to $Q'G_0'$ we may proceed very simply as follows.
Since to a sufficient approximation $Q_0'G_0'$ is normal to $Q'G_0'$,

$$Q'G_0' = \Delta - \tfrac12\psi^2\Delta + \ldots, \tag{39.1}$$

where $\psi$ is the angle between Q’O’ and Q’I’. Writing simply $\epsilon'$
for $(\boldsymbol{\epsilon}'\cdot\boldsymbol{\epsilon}')^{1/2}$, we also have

$$\psi = \epsilon'/R' + \ldots. \tag{39.2}$$

Now $\epsilon' = \epsilon_3 + O(5)$ and $\Delta = O(4)$, so that the second term on the
right of (39.1) is $O(10)$. We therefore have

$$v = \Delta - (\epsilon_3^2/2R'^2)\,\Delta + O(12). \tag{39.3}$$

Thus no distinction need be drawn between $v$ and $\Delta$ as regards the
displacement of orders 3, 5 and 7.

Next we come to $\mathring{v}$, where the state of affairs is not quite so
straightforward. We first write (38.6) in the form

$$\mathring{v} = (Q'D' - Q'M') + (D'I' - M'I') + Q'G_0', \tag{39.4}$$

where M’ is the foot of the normal from D’ onto Q’I’. Let $\psi_1$ be
the angle between D’I’ and Q’I’, so that $\psi + \psi_1$ is the angle between
D’O’ and D’I’, i.e.

$$\psi + \psi_1 = \epsilon'/S' + O(5), \tag{39.5}$$

where

$$S' = M'I'. \tag{39.6}$$

Then (39.4) becomes

$$\begin{aligned}
\mathring{v} &= Q'M'(\sec\psi - 1) + S'(\sec\psi_1 - 1) + v\\
 &= R'(\sec\psi - 1) + S'(\sec\psi_1 - \sec\psi) + v\sec\psi,
\end{aligned} \tag{39.7}$$

since $Q'M' = R' + v - M'I'$. So far everything is exact. Now $\psi$
and $\psi_1$ are both $O(3)$, and $v = \Delta + O(10)$, so that (39.7) yields

$$\begin{aligned}
\mathring{v} &= \Delta + \tfrac12 R'\psi^2 + \tfrac12 S'(\psi_1^2 - \psi^2) + O(10)\\
 &= \Delta + \tfrac12\epsilon'^2(S'^{-1} - R'^{-1}) + O(10),
\end{aligned}$$

because of (39.2) and (39.5). However,
$S' = D'I' + O(6) = 1 + \tfrac12 u + O(4)$,
whilst $R' = 1 + \tfrac12\zeta + O(4)$. Hence finally

$$\mathring{v} = \Delta - \tfrac14(\xi - 2\eta)\,\epsilon_3^2 + O(10). \tag{39.8}$$

We see that $\mathring{v}$ does not approximate $\Delta$ quite as well as $v$ does.
$v^\dagger$ is of course exactly $\Delta$, by definition, whilst, according to (37.10),

$$\bar{v}^\dagger = R'\Delta - \tfrac12\Delta^2 = \Delta + \tfrac12\zeta\Delta + O(8). \tag{39.9}$$

It remains to consider $D$. From (38.14)

$$2D = (R' + v)^2 - R'^2,$$

whence, using (39.3),

$$D = \Delta + \tfrac12\zeta\Delta + O(8) = \bar{v}^\dagger + O(10). \tag{39.10}$$

With regard to the aberration functions $t$ and $w$, it will be seen
that one cannot deal with them in the same way as with $v$ or $v^\dagger$.
The reason for this is that a wavefront corresponds to rays which
originate from some object point O. However, the condition that
they should do so is not easily accommodated when contemplating,
say, the constancy of $T + U' - U$, this being $V$, but expressed as a
function of the ‘wrong’ coordinates $\boldsymbol{\beta}', \boldsymbol{\beta}$.


40. Remarks on an integral involving $\Delta$. Circle
polynomials


In the diffraction theory of aberrations one is naturally confronted
with integrals of functions of $\Delta$, extended over a certain part of $W_0$.
Such integrals also occur in the purely geometric theory. A
prominent example is the root-mean-square retardation of the wavefront,
$\sqrt{Q}$, say. It is usual to use as coordinates $\mathbf{y}''$, or rather a pair closely
related to these. ($\mathbf{y}_1$ is left understood throughout.) They are then
to be replaced by appropriate polar variables $\rho$ and $\theta$. At any rate,
granted that the exit pupil is unvignetted and that the inclination
of the line E’I’ to the axis (Fig. 5.1) is not too large, one is led to an
integral of the form

$$Q = \frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi} t\,\Delta^2\,dt\,d\theta, \tag{40.1}$$

where $t = \rho/\rho_0$ is a normalized radial variable, and $\Delta$ is to be thought
of as a function of this and of $\theta$. In place of $\sqrt{Q}$ one may take the
root-mean-square deviation $\sqrt{Q_0}$ of the retardation from its mean $\bar\Delta$:

$$\bar\Delta = \frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi} t\,\Delta\,dt\,d\theta; \tag{40.2}$$

so that $Q_0$ is given by an integral like that in (40.1) but with $\Delta$
replaced by $\Delta - \bar\Delta$.

In the geometrical theory $Q$ or $Q_0$ is often referred to as a merit
function or performance number, and the practical problem is to
minimize it with respect to the parameters defining the constitution
of the system K. There appears to be no compelling theoretical
foundation for such a procedure. Rather, one is just formulating the
intuitive notion that provided $Q$ can be made sufficiently small,
$\epsilon'$ will be sufficiently small, and therefore the performance of K
adequate. This is certainly true, but the rub lies in the phrase
‘sufficiently small’. However, $Q_0$ also appears, this time quite
naturally, in the theory of diffraction-limited systems, i.e. systems
such that

$$2\pi\sqrt{Q_0} \ll \lambda, \tag{40.3}$$

where $\lambda$ is the wavelength of the light forming the image. In fact,
$1 - (2\pi/\lambda)^2 Q_0$ is then the ratio of the intensity of light at the point
where this is a maximum, i.e. at the diffraction focus, to what it
would be at I’ if aberrations were absent. This ratio bears the name
Strehl intensity.

At any rate, it is obvious that the usual form of $\Delta$ as a power series
in $t$, with coefficients which are polynomials in $\cos\theta$, leads to a very
unwieldy integration in the context of (40.1). To begin with, the
integration over $\theta$ will be greatly simplified if the various powers of
$\cos\theta$ just referred to be replaced by $\cos m\theta$ $(m = 1, 2, \ldots)$, bearing
in mind that

$$\int_0^{2\pi}\cos m_1\theta\,\cos m_2\theta\,d\theta =
\begin{cases}
0 & (m_1 \neq m_2),\\
\pi & (m_1 = m_2 \neq 0),\\
2\pi & (m_1 = m_2 = 0).
\end{cases} \tag{40.4}$$

The factor multiplying $\cos m\theta$ is a power series in $t$; and here again
great simplicity will be attained with regard to the integration over $t$
if this factor be expressed in terms of orthogonal polynomials
$R_{nm}(t)$, i.e. polynomials which are such that

$$\int_0^1 t\,R_{n_1 m}(t)\,R_{nm}(t)\,dt =
\begin{cases}
0 & (n_1 \neq n),\\
\tau_n & (n_1 = n),
\end{cases} \tag{40.5}$$

where the $\tau_n$ are positive numbers still at our disposal. In the light
of subsequent experience the choice

$$\tau_n = 1/[2(n+1)] \tag{40.6}$$

is a good one, in that it causes the coefficients of $R_{nm}$ to be integral.
Evidently, granted that the highest power of $t$ in $R_{nm}$ is to be $n$,
the sign of $t^n$ is not yet fixed, and we require it to be positive. We
now write

$$\Delta = \sum_{n=0}^{\infty}\,{\sum_{m=0}^{n}}' A_{nm}R_{nm}(t)\cos m\theta, \tag{40.7}$$

where the prime is intended to indicate that when writing this
series out in full every $A_{n0}$ is to be replaced by $2^{-1/2}A_{n0}$. (The reason
for doing so is evident from (40.4).) The virtue of the form (40.7)
of $\Delta$ is now obvious from the simplicity of the result

$$Q = \tfrac12\sum_{n}\sum_{m} A_{nm}^2/(n+1). \tag{40.8}$$

(Here $A_{n0}$ of course really stands for $A_{n0}$, without the factor $2^{-1/2}$.)
$\bar\Delta$ is very easily evaluated if one remembers that $R_{00} = 1$, in view
of $\tau_0$ having the value $\tfrac12$. Thus

$$\bar\Delta = 2^{-1/2}A_{00}, \tag{40.9}$$

and then $Q_0$ is given by (40.8) with the term which has $n = m = 0$
omitted.
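The closed form (40.8) can be compared with a direct numerical evaluation of the integral (40.1). The sketch below is an illustration only: it assumes the normalization of (40.5) and (40.6), under which the sum in (40.8) carries an overall factor 1/2, it evaluates the circle polynomials from their standard factorial formula rather than by the step-by-step construction of the text, and the coefficient values are arbitrary. All function names are invented for this example.

```python
import math

def radial(n, m, t):
    """Value of the circle polynomial R_nm(t); zero when n - m is odd or m > n."""
    if m > n or (n - m) % 2:
        return 0.0
    total = 0.0
    for k in range((n - m) // 2 + 1):
        coeff = ((-1) ** k * math.factorial(n - k)
                 / (math.factorial(k)
                    * math.factorial((n + m) // 2 - k)
                    * math.factorial((n - m) // 2 - k)))
        total += coeff * t ** (n - 2 * k)
    return total

def retardation(coeffs, t, theta):
    """Series (40.7); coeffs maps (n, m) to A_nm, and m = 0 terms get 2**-0.5."""
    total = 0.0
    for (n, m), a in coeffs.items():
        weight = 2 ** -0.5 if m == 0 else 1.0
        total += weight * a * radial(n, m, t) * math.cos(m * theta)
    return total

def q_numeric(coeffs, nt=400, ntheta=64):
    """Midpoint-rule estimate of (40.1)."""
    dt, dth = 1.0 / nt, 2 * math.pi / ntheta
    acc = 0.0
    for i in range(nt):
        t = (i + 0.5) * dt
        for j in range(ntheta):
            acc += t * retardation(coeffs, t, (j + 0.5) * dth) ** 2
    return acc * dt * dth / math.pi

def q_closed(coeffs):
    """Closed form (40.8) with tau_n = 1 / (2 (n + 1))."""
    return 0.5 * sum(a * a / (n + 1) for (n, m), a in coeffs.items())

if __name__ == "__main__":
    coeffs = {(0, 0): 0.3, (2, 0): 0.5, (3, 1): -0.4, (4, 2): 0.2}  # arbitrary
    print(q_numeric(coeffs), q_closed(coeffs))
```

Dropping the (0, 0) entry from `coeffs` before calling `q_closed` gives the mean-square deviation from the mean, in line with the remark following (40.9).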

We desire to continue in an entirely elementary way, and show
how the $R_{nm}$ may be found explicitly by means of a step-by-step
method. To begin with, from a formal point of view any point given
by $t, \theta$ is also given by $-t, \theta + \pi$. It follows that $R_{nm}$ must be even
or odd according as $m$ is even or odd. Since the highest power of
$t$ in $R_{nm}$ was to be $n$, it follows that

$$R_{nm}(t) = 0 \qquad (n - m \text{ odd}), \tag{40.10}$$

and many terms of (40.8) are in fact absent as a consequence.
Moreover, $\cos\theta$ appears everywhere only on account of its occurrence in
$\eta$, where it has a factor $t$. Hence $\cos^m\theta$ and therefore $\cos m\theta$ must
have $t^m$ as a factor, i.e. the power of the term of lowest degree in
$R_{nm}$ is $m$. Consequently $R_{nn} = a_0t^n$, and (40.5) shows that $a_0 = 1$:

$$R_{nn} = t^n. \tag{40.11}$$

Now, writing $R_{n,n-2} = a_1t^n + b_1t^{n-2}$, take $m = n-2$, and
alternatively $n_1 = n$ and $n-2$ in (40.5). Using (40.11) one gets two
conditions to determine $a_1$ and $b_1$, and in this way we get

$$R_{n,n-2} = nt^n - (n-1)t^{n-2}. \tag{40.12}$$

Again, given (40.11) and (40.12), the relation (40.5) with $m = n-4$
and $n_1 = n, n-2, n-4$ in turn allows one to calculate the three
coefficients of $R_{n,n-4}$. We thus find that

$$R_{n,n-4} = \tfrac12 n(n-1)t^n - (n-1)(n-2)t^{n-2} + \tfrac12(n-2)(n-3)t^{n-4}. \tag{40.13}$$

It should be clear by now how we can successively obtain
$R_{n,n-6}, R_{n,n-8}, \ldots$
in this fashion. The explicit results above already give all the $R_{nm}$
for $n$ and $m$ not exceeding 6, with the sole exception of $R_{60}$ and
this turns out to be $20t^6 - 30t^4 + 12t^2 - 1$. We therefore have the
following table:


TABLE 5.1

m \ n   0   1   2          3           4                  5                     6

  0     1   .   2t^2 - 1   .           6t^4 - 6t^2 + 1    .                     20t^6 - 30t^4 + 12t^2 - 1
  1     .   t   .          3t^3 - 2t   .                  10t^5 - 12t^3 + 3t    .
  2     .   .   t^2        .           4t^4 - 3t^2        .                     15t^6 - 20t^4 + 6t^2
  3     .   .   .          t^3         .                  5t^5 - 4t^3           .
  4     .   .   .          .           t^4                .                     6t^6 - 5t^4
  5     .   .   .          .           .                  t^5                   .
  6     .   .   .          .           .                  .                     t^6

The polynomials $R_{nm}(t)$ are the so-called Zernike circle polynomials.
We have seen how they can be constructed in an elementary way;
but they are in fact merely a special example from the theory of
classical orthogonal polynomials. Explicitly, they are, in effect,
special Jacobi polynomials; so that recursion relations, generating
functions, and the like are easily written down. However, we have
no need here to concern ourselves with these.
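The entries of Table 5.1 and the orthogonality relation (40.5) with the choice (40.6) can be reproduced with a few lines of exact arithmetic. This is a sketch, not the step-by-step construction of the text: it assumes the standard closed-form factorial expression for the radial Zernike polynomials, and the function names are invented for this example.

```python
from fractions import Fraction
from math import factorial

def zernike_radial(n, m):
    """Coefficients of R_nm(t) as {power: coefficient}; {} if n - m is odd or m > n."""
    if m < 0 or m > n or (n - m) % 2:
        return {}
    coeffs = {}
    for k in range((n - m) // 2 + 1):
        num = (-1) ** k * factorial(n - k)
        den = (factorial(k)
               * factorial((n + m) // 2 - k)
               * factorial((n - m) // 2 - k))
        coeffs[n - 2 * k] = Fraction(num, den)
    return coeffs

def weighted_inner(p, q):
    """Exact integral of t * p(t) * q(t) over 0 <= t <= 1."""
    return sum(a * b * Fraction(1, i + j + 2)
               for i, a in p.items() for j, b in q.items())

if __name__ == "__main__":
    print(zernike_radial(6, 0))   # coefficients of R_60
    # Norm of R_42 and its orthogonality to R_22 (same m, different n).
    print(weighted_inner(zernike_radial(4, 2), zernike_radial(4, 2)))
    print(weighted_inner(zernike_radial(4, 2), zernike_radial(2, 2)))
```

Because the coefficients are kept as fractions, the integrality noted after (40.6) shows up directly: every denominator is 1.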

If we restore the explicit dependence of A upon $h'$, $A_{nm}$ becomes
a power series in $h'$:


$A_{nm} = \sum_{l=0}^{\infty} A_{nml}\,h'^{\,m+2l}.$ (40.14)


Since $R_{nm}$ is of degree $n$ in $t$, any coefficient $A_{nml}$ may be thought of
as relating to an aberration of ‘order’ $n+m+2l-1$; but, in fact,
the polynomial character of $R_{nm}$ entails that such an aberration is
a mixture of aberrations of order equal to and less than that just 
given, if the term ‘order’ is now understood in the sense ascribed 
to it hitherto. With regard to the relations between the old and the 
new aberration coefficients it will suffice to consider those of the 
third order. A is then given by the right-hand member of (19.1), 
together with (19.4) and p = pot. Picking out the relevant terms 
from (40.7) and (40.14), direct comparison gives, with (19.6), 


Argo = (sea 2) 01, Asiy= 40302, Assy = 4P6Fs 
Ago, = (44/2) 03(203+ 0%), Ary = Pots: (40.15) 


At any rate, the ‘wavefront aberrations’ are preferably to be taken 
as governed by the aberration coefficients $A_{nml}$, on account of the
relatively transparent role these play in diffraction theory. The 
various ‘types’ of aberrations evidently differ to some extent from 
those studied earlier, at least with regard to the details of the dis- 
placements which they induce. 


C. The dependence of the aberrations on the positions 
of the object and of the stop 


41. Stop shift and point characteristic 


A shift of the stop causes the axial point EZ’ of the plane of the paraxial 
exit pupil to move through a certain distance, %’ say. The customary 
interpretation of the aberrations on the basis of V requires the 
posterior base-plane & to coincide with &’, so that when the latter 
is moved, V has to be recalculated for the new plane of the exit pupil 
&’, say. This problem has, however, already been fully considered 
in Section 35. Accordingly, equations such as (35.13) and (35.14)
give exactly the new (characteristic) coefficients in terms of
the old. 

With regard to the effective coefficients one must not forget that 
d’ = 1—%' = cis not equal to unity, see (23.4). Accordingly, one 


has, f t 
Bey TOR INBEANOS Be — go = 4tpy = cpt, (41-1) 
and so on. Again, recalling (23.4), 
sf = 6(s,+p1), Sf = 6(c-%5, + ¢)y); 
from which one deduces, by means of (41.1) and the first member of 
(35-14), that sf = & (of + 39p2?). (41.2) 


Of course it is not at all necessary to proceed via the characteristic 
coefficients to obtain (41.2). 

The ubiquitous appearance of powers of c is a nuisance. If 
desired it can be prevented in the following way. Let a ray intersect 
&' in D, as usual. Lines drawn through D’ and I’ on the one hand, 
and through E’ and I’ on the other, intersect &” in points D; and D,,, 
respectively. Denote the differences between the coordinates of 
Dj; and D, by 7’. Then we may specify the ray in question by j’ 
rather than 7’. From elementary considerations it follows that 

y= cf’. (41.3) 
Provided equations such as (35.13) be now interpreted as relating to 
the new coordinates—one should write J, instead of p,, and so on— 
all the powers of c are to be omitted from them. This procedure en- 
sures that, to a sufficient approximation, equal maximum values of 
the usual polar variables, p and / say, correspond to the same 
angular aperture of the respective pencils of rays admitted by K. 

We note in passing that the quantity $c^{-2}(2\hat p_3 - \hat p_4)$ has a fixed
value for all positions of the stop; a result which illuminates the 
definition of o, in (19.6), and the generalization of which will 
occupy us in Sections 46 ff. 

One would surely like also to get rid of terms such as that in (41.2) 
which is quadratic in p~. There is no very easy way of doing this 
here. It may, however, be mentioned that they can be eliminated, 
provided one goes over from f’ to another pair of variables in the 
object space, e.g. the coordinates of the point of intersection of the 
ray with the plane of the entrance pupil. There is no need to enter
into the details of this process here, especially as the problem of
stop shifts will be considered afresh in Section 44 in the context 
of the angle characteristic. 


42. Object shift and point characteristic 


Let the object plane # be shifted to a new position through a dis- 
tance %. The new object plane is naturally denoted by ¥, and the 
ideal image plane conjugate to it by ¥’. The distance between E’— 
which has remained fixed—and 0} is d’. The magnification associ- 
ated with % and .7’ is m, whereas that associated with & and &”’, 
i.e. 5, is unaltered. Essentially we now need to repeat the work of 
Section 35, in the sense that we require the explicit dependence of 
the point characteristic upon x. Unfortunately this is a more in- 
volved problem than the earlier one, mainly on account of the non- 
constancy of m. We therefore consider it only in the barest outline, 
since the effects of object shift will be treated in some detail later on 
on the basis of the angle characteristic. 

Independently of the variables on which the old and new charac- 
teristics depend, we may write 


V = V—2a, (42.1) 
which may be compared with (35.3). This gives the equation cor- 
responding to (35.4), 1.e. 

o(%; £9, £2) = o(, 9, €) + (d+. a)t—(d? +.u)t —8/a 
+(1/2mf) €—(1/amf)6+2-d' +d’. (42.2) 
All non-linear terms of g and ? have been absorbed in v and 0 
respectively, the linear terms having coefficients quoted just after 
equation (15.4). Also, in view of (14.27), (14.28), 
d=(s—m)f, d’=(s—m)f, &=d—d = [(1/m)—(1/m)]f. 
(42.3) 


It remains to express £, 7, ¢ on the right of (42.2) through E,9, €. 
To this end one uses the equation 


B = —moV/oy, . 
= —m(d"+u)4(y'—y,)+yilf—m(y'v,+2y10,), (42.4) 
andthence1/xand = $, = (m/m) y, + MRB/a (42.5)
are obtained. The whole process is rather cumbersome, and we
pursue it no further. In any event, one may deal directly with the 
effective coefficients; but then it must be borne in mind that e’ and 
é’ now do not denote the same displacement merely regarded as 
functions of alternative sets of variables. 


43. On a linear substitution involving direction cosines


This section deals with a simple algebraic problem which arises in
various contexts. We first bring back to mind the rotational invariants
$\xi, \eta, \zeta$ which occupy so prominent a place in the context of the
angle characteristic. They are linear functions of the basic invariants
$\bar\xi, \bar\eta, \bar\zeta$, according to equations (15.11) and (15.12). The following
question now arises. Suppose we make a linear substitution with
constant coefficients


$\beta \to a\beta + b\beta', \qquad \beta' \to c\beta + d\beta',$ (43.1)
arbitrary except for the condition
$g = bc - ad \neq 0.$ (43.2)


Then, what is the effect of (43.1) on $\xi, \eta, \zeta$?
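If we momentarily write $a_1 = a - sc$, $b_1 = b - sd$, $c_1 = a - mc$,
$d_1 = b - md$ we find easily that, as a consequence of (43.1),
$\xi \to b_1^2\,\bar\xi + 2a_1b_1\,\bar\eta + a_1^2\,\bar\zeta,$
$\eta \to b_1d_1\,\bar\xi + (a_1d_1 + b_1c_1)\,\bar\eta + a_1c_1\,\bar\zeta,$ (43.3)
$\zeta \to d_1^2\,\bar\xi + 2c_1d_1\,\bar\eta + c_1^2\,\bar\zeta.$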
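On the right $\bar\xi, \bar\eta, \bar\zeta$ may be eliminated in favour of $\xi, \eta, \zeta$ by means
of (15.12). Denoting the resulting expressions by $\hat\xi, \hat\eta, \hat\zeta$ we have
the desired result


$\xi \to \hat\xi = \bar a^2\xi - 2\bar a\bar b\,\eta + \bar b^2\zeta,$
$\eta \to \hat\eta = \bar a\bar c\,\xi - (\bar a\bar d + \bar b\bar c)\,\eta + \bar b\bar d\,\zeta,$ (43.4)
$\zeta \to \hat\zeta = \bar c^2\xi - 2\bar c\bar d\,\eta + \bar d^2\zeta,$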

where

$\bar a = (s-m)^{-1}[(ma+b) - s(mc+d)],$
$\bar b = (s-m)^{-1}[(sa+b) - s(sc+d)],$
$\bar c = (s-m)^{-1}[(ma+b) - m(mc+d)],$ (43.5)
$\bar d = (s-m)^{-1}[(sa+b) - m(sc+d)];$

and $\bar b\bar c - \bar a\bar d = -g$, A=’, (43.6)
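where A is the discriminant of (43.4). The inverse of (43.4) is
$\xi = g^{-2}(\bar d^2\hat\xi - 2\bar b\bar d\,\hat\eta + \bar b^2\hat\zeta),$
$\eta = g^{-2}[\bar c\bar d\,\hat\xi - (\bar b\bar c + \bar a\bar d)\,\hat\eta + \bar a\bar b\,\hat\zeta],$ (43.7)
$\zeta = g^{-2}(\bar c^2\hat\xi - 2\bar a\bar c\,\hat\eta + \bar a^2\hat\zeta).$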
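
The relations (43.4) to (43.6) lend themselves to a direct numerical check. The following sketch rests on stated assumptions: the direction-cosine pairs are treated as plain 2-vectors, the invariants are taken as the scalar products of $\beta - s\beta'$ and $\beta - m\beta'$, and the helper names (`invariants`, `barred_constants`, `apply_43_4`) are hypothetical. It applies the substitution (43.1) directly and compares the result with the closed forms.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def invariants(beta, beta_p, s, m):
    """xi, eta, zeta formed from the 2-vectors (beta - s beta') and (beta - m beta')."""
    u = [x - s * y for x, y in zip(beta, beta_p)]
    v = [x - m * y for x, y in zip(beta, beta_p)]
    return dot(u, u), dot(u, v), dot(v, v)

def barred_constants(a, b, c, d, s, m):
    """The four barred constants of (43.5)."""
    k = 1 / (s - m)
    return (k * ((m * a + b) - s * (m * c + d)),
            k * ((s * a + b) - s * (s * c + d)),
            k * ((m * a + b) - m * (m * c + d)),
            k * ((s * a + b) - m * (s * c + d)))

def apply_43_4(inv, consts):
    """Transformed invariants according to (43.4)."""
    xi, eta, zeta = inv
    ab, bb, cb, db = consts
    return (ab * ab * xi - 2 * ab * bb * eta + bb * bb * zeta,
            ab * cb * xi - (ab * db + bb * cb) * eta + bb * db * zeta,
            cb * cb * xi - 2 * cb * db * eta + db * db * zeta)

if __name__ == "__main__":
    s, m = F(3), F(1)
    a, b, c, d = F(2), F(-1), F(1, 2), F(5)
    beta, beta_p = [F(1, 3), F(2)], [F(-1), F(1, 4)]
    new_beta = [a * x + b * y for x, y in zip(beta, beta_p)]
    new_beta_p = [c * x + d * y for x, y in zip(beta, beta_p)]
    print(invariants(new_beta, new_beta_p, s, m))
    print(apply_43_4(invariants(beta, beta_p, s, m), barred_constants(a, b, c, d, s, m)))
```

Exact rational arithmetic is used so that the two results can be compared for equality rather than to within rounding error.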


44. Stop shift and angle characteristic. 


When the stop is shifted, the pupil planes will move, and the magnification
associated with them will change from $s$ to $\hat s$, say. The
base-points being taken at $O_0'$ and $O_0$ as usual, the position of the
stop is therefore contained in $T$ only through the constant $s$. We
again absorb all non-linear terms of g(¢) in the aberration function, 
so that according to (15.13) and (14.35) 


$T = a_0 + \zeta/2m + t,$ (44.1)


the focal length $f$ having been taken as unity. (Note that $f$ is a constant
of the system, unlike $d'$.) Since $T$, regarded as an optical distance,
is independent of the position of the stop, we thus have the
identity


$\hat\zeta/2m + \hat t(\hat\xi, \hat\eta, \hat\zeta) = \zeta/2m + t(\xi, \eta, \zeta),$ (44.2)


where $\xi, \eta, \zeta$ refer to $s$, and $\hat\xi, \hat\eta, \hat\zeta$ to $\hat s$. We can use the results of the
preceding section, for the change from the first to the second set
of invariants corresponds to choosing the constants in (43.1) in
such a way that $\beta - s\beta' \to \beta - \hat s\beta'$ whilst $\beta - m\beta'$ remains fixed.
This requires that


$a = 1, \quad b = mq/(1-q), \quad c = 0, \quad d = 1/(1-q),$ (44.3)

where $q = (\hat s - s)/(\hat s - m).$ (44.4)
Then, from (43.5),

$\bar a = -1/(1-q), \quad \bar b = -q/(1-q), \quad \bar c = 0, \quad \bar d = 1.$ (44.5)


In (35.10) $q$ denoted the quantity $-\hat x'/(d' - \hat x')$. Now $\hat x' = d' - \hat d'$
and so, from (14.27), $q = -(s-\hat s)/(\hat s-m)$, which is identical with
the right-hand member of (44.4). It will be convenient to retain
also the previous abbreviation $c = 1-q$.
The relations (43.7) here become, since $g^2 = (1-q)^{-2}$,
$\xi = c^2\hat\xi + 2cq\,\hat\eta + q^2\hat\zeta, \quad \eta = c\,\hat\eta + q\,\hat\zeta, \quad \zeta = \hat\zeta.$ (44.6)
This is formally the same as (35.12), provided all terms not quadratic
in the coordinates are omitted from the latter. To this extent
one therefore has a much simpler situation here.
Since $\hat\zeta = \zeta$, (44.2) reduces to

$\hat t(\hat\xi, \hat\eta, \hat\zeta) = t(\xi, \eta, \zeta),$ (44.7)
and it only remains to insert (44.6) on the right. We use the usual
notation for the coefficients of low orders, e.g. $t^{(2)} = p_1\xi^2 + p_2\xi\eta + \ldots$,
so that incidentally the initial factor $(s-m)^{-1}$ in (19.24) is to be
omitted here. For the primary coefficients we then get exactly the
equations (35.13) again. Then, for instance,


BT = (§—m) p, = AS—m)p, = Cp, 
consistently with (41.4). The equations for the secondary coefficients
are


$\hat s_1 = c^6 s_1,$

$\hat s_2 = c^5(s_2 + 6qs_1),$

$\hat s_3 = c^4(s_3 + qs_2 + 3q^2s_1),$

$\hat s_4 = c^4(s_4 + 4qs_2 + 12q^2s_1),$

$\hat s_5 = c^3(s_5 + 2qs_4 + 4qs_3 + 6q^2s_2 + 12q^3s_1),$

$\hat s_6 = c^2(s_6 + qs_5 + q^2s_4 + 2q^2s_3 + 2q^3s_2 + 3q^4s_1),$

$\hat s_7 = c^3(s_7 + 2qs_4 + 4q^2s_2 + 8q^3s_1),$

$\hat s_8 = c^2(s_8 + 3qs_7 + 2qs_5 + 5q^2s_4 + 4q^2s_3 + 8q^3s_2 + 12q^4s_1),$

$\hat s_9 = c(s_9 + 2qs_8 + 3q^2s_7 + 2qs_6 + 3q^2s_5 + 4q^3s_4 + 4q^3s_3 + 5q^4s_2 + 6q^5s_1),$

$\hat s_{10} = s_{10} + qs_9 + q^2s_8 + q^3s_7 + q^2s_6 + q^3s_5 + q^4s_4 + q^4s_3 + q^5s_2 + q^6s_1.$

(44.8)
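
Relations of the kind (44.8) follow mechanically from inserting (44.6) into a cubic form in the invariants, so they can be regenerated by a small polynomial substitution. The sketch below is an illustration under assumptions: the secondary coefficients are attached to the monomials in the order $\xi^3, \xi^2\eta, \xi^2\zeta, \xi\eta^2, \xi\eta\zeta, \xi\zeta^2, \eta^3, \eta^2\zeta, \eta\zeta^2, \zeta^3$, and the names `mul`, `substitute` and `stop_shift` are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Exponents (i, j, k) of xi^i eta^j zeta^k for s1..s10 in the assumed usual order.
ORDER = [(3, 0, 0), (2, 1, 0), (2, 0, 1), (1, 2, 0), (1, 1, 1),
         (1, 0, 2), (0, 3, 0), (0, 2, 1), (0, 1, 2), (0, 0, 3)]

def mul(p, q):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for (e1, c1), (e2, c2) in product(p.items(), q.items()):
        e = tuple(x + y for x, y in zip(e1, e2))
        out[e] = out.get(e, 0) + c1 * c2
    return out

def substitute(form, xi, eta, zeta):
    """Replace xi, eta, zeta in `form` by the given linear forms in the hatted variables."""
    out = {}
    for (i, j, k), coeff in form.items():
        term = {(0, 0, 0): coeff}
        for lin, power in ((xi, i), (eta, j), (zeta, k)):
            for _ in range(power):
                term = mul(term, lin)
        for e, c in term.items():
            out[e] = out.get(e, 0) + c
    return out

def stop_shift(coeffs, q):
    """Hatted secondary coefficients obtained by inserting (44.6) into the cubic form."""
    c = 1 - q
    xi = {(1, 0, 0): c * c, (0, 1, 0): 2 * c * q, (0, 0, 1): q * q}
    eta = {(0, 1, 0): c, (0, 0, 1): q}
    zeta = {(0, 0, 1): F(1)}
    new = substitute(dict(zip(ORDER, coeffs)), xi, eta, zeta)
    return [new.get(e, F(0)) for e in ORDER]

if __name__ == "__main__":
    print(stop_shift([F(k) for k in range(1, 11)], F(1, 3)))
```

The same `substitute` routine could be reused with the linear forms of (45.7) or (46.4) in place of (44.6); only the three dictionaries in `stop_shift` would change.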
These are exact; and to derive those relating to seventh and higher
orders is evidently a matter of little difficulty. We give the results
for the first eleven coefficients of order $2n-1$, writing these simply
as $k_1, \ldots, k_{11}$, taken in the usual order. This means that


$t^{(n)}_{\mu\nu} \equiv k_\alpha, \quad \alpha = \tfrac{1}{2}\mu(\mu+1) + \nu + 1,$ (44.9)
for all $n$, $\mu$ and $\nu$ such that $\nu \le \mu \le n$. Then, absorbing a factor
$c^{-2n+\mu+\nu}$ in $\hat k_\alpha$ (cf. the discussion following equation (41.2)) one
finds that
$\hat k_1 = k_1,$
$\hat k_2 = k_2 + 2nqk_1,$
$\hat k_3 = k_3 + qk_2 + nq^2k_1,$
$\hat k_4 = k_4 + 2(n-1)qk_2 + 2n(n-1)q^2k_1,$
$\hat k_5 = k_5 + 2qk_4 + (n-1)(2qk_3 + 3q^2k_2) + 2n(n-1)q^3k_1,$
$\hat k_6 = k_6 + qk_5 + q^2k_4 + (n-1)(q^2k_3 + q^3k_2) + \tfrac{1}{2}n(n-1)q^4k_1,$
$\hat k_7 = k_7 + 2(n-2)qk_4 + 2(n-1)(n-2)q^2k_2 + \tfrac{4}{3}n(n-1)(n-2)q^3k_1,$
$\hat k_8 = k_8 + 3qk_7 + (n-2)(2qk_5 + 5q^2k_4) + 2(n-1)(n-2)(q^2k_3 + 2q^3k_2)$
$\qquad + 2n(n-1)(n-2)q^4k_1,$
$\hat k_9 = k_9 + 2qk_8 + 3q^2k_7 + (n-2)(2qk_6 + 3q^2k_5 + 4q^3k_4)$
$\qquad + (n-1)(n-2)(2q^3k_3 + \tfrac{5}{2}q^4k_2) + n(n-1)(n-2)q^5k_1,$
$\hat k_{10} = k_{10} + qk_9 + q^2k_8 + q^3k_7 + (n-2)(q^2k_6 + q^3k_5 + q^4k_4)$
$\qquad + \tfrac{1}{2}(n-1)(n-2)(q^4k_3 + q^5k_2) + \tfrac{1}{6}n(n-1)(n-2)q^6k_1,$
$\hat k_{11} = k_{11} + 2(n-3)qk_7 + 2(n-2)(n-3)q^2k_4 + \tfrac{4}{3}(n-1)(n-2)(n-3)q^3k_2$
$\qquad + \tfrac{2}{3}n(n-1)(n-2)(n-3)q^4k_1.$ (44.10)
The relations (44.8) are of course contained in the first ten of these.
In Section 41 it was noted that $c^{-2}(2\hat p_3 - \hat p_4)$ has a fixed value for
all positions of the stop. From (44.10) we now see by inspection
that $2(n-1)\hat k_3 - \hat k_4 = 2(n-1)k_3 - k_4$. Restoring the appropriate
power of $c$ this means that the quantity
$\mathfrak{s}_{2n-1,1} = 2(n-2)!\,(s-m)^{2n-2}\,[2(n-1)k_3 - k_4]$ (44.11)


has a value independent of $s$; so that this is a generalization to all
orders of the result previously obtained (the factor $2(n-2)!$ has
been inserted for later convenience). The coefficients which occur
in (44.11) govern oblique spherical aberration of order $2n-1$.
We shall have occasion to reconsider such ‘invariant’ expressions
later on in full generality.

It certainly appears at first sight as if the theory of the effects 
of stop shifts is much simpler here than the corresponding developments
of Section 41. This simplicity is, alas, illusory, to the extent
that all the complexities of detail reappear as soon as one enquires
into the effects of stop shifts on the effective coefficients; for these 
are, after all, what one is ultimately interested in, at any rate in the 
geometrical theory. As a matter of fact, quite a lengthy calculation 
is required merely to confirm that (41.2) can be reproduced on the 
basis of (35.13), (44.8) and the first member of (24.13). In short, by 
using the angle characteristic one sweeps all the nasty details under 
the carpet; and one does well to keep this constantly in mind when 
going through the otherwise elegant results which we are about to 
derive. 


45. Object shift and angle characteristic


We now go on to investigate the effect of shifting the object plane
through a distance $\hat x$, the conjugate image plane shifting through a
corresponding distance $\hat x'$. It will emerge that the theory is here
genuinely simpler than that of Section 42. We have, to begin with,
$\hat T = T - \hat x\alpha + \hat x'\alpha',$ (45.1)
with $\hat x = -(\hat m - m)/m\hat m, \quad \hat x' = -(\hat m - m).$ (45.2)
As usual all non-linear terms of $g$ are to be absorbed in the aberration
function, and then (45.1) becomes
$\hat t(\hat\xi, \hat\eta, \hat\zeta) = t(\xi, \eta, \zeta) - (\hat\zeta/2\hat m) + (\zeta/2m) + [(\hat m - m)/m\hat m]\,[(1-\bar\zeta)^{1/2} - 1]$
$\qquad - (\hat m - m)\,[(1-\bar\xi)^{1/2} - 1].$ (45.3)
We proceed exactly as in the context of stop shifts and determine
the constants in (43.1) in such a way that $\beta - m\beta' \to \beta - \hat m\beta'$
whilst $\beta - s\beta'$ remains fixed. This requires that
$a = 1, \quad b = sp/(1-p), \quad c = 0, \quad d = 1/(1-p),$ (45.4)
with $p = -(\hat m - m)/(s - \hat m).$ (45.5)
Then from (43.5)
$\bar a = -1, \quad \bar b = 0, \quad \bar c = p/(1-p), \quad \bar d = 1/(1-p),$ (45.6)
whence, writing $b = 1-p$,
$\xi = \hat\xi, \quad \eta = p\,\hat\xi + b\,\hat\eta, \quad \zeta = p^2\hat\xi + 2bp\,\hat\eta + b^2\hat\zeta.$ (45.7)
Equations (45.4)-(45.7) evidently run entirely parallel to (44.3)-(44.6).
On the right of (45.3) we also have $\bar\xi$ and $\bar\zeta$ explicitly, but
these may be expressed in terms of $\hat\xi, \hat\eta, \hat\zeta$ by means of (15.3), i.e.


$\bar\xi = (s-\hat m)^{-2}(\hat\xi - 2\hat\eta + \hat\zeta), \quad \bar\zeta = (s-\hat m)^{-2}(\hat m^2\hat\xi - 2s\hat m\,\hat\eta + s^2\hat\zeta).$ (45.8)
The constant terms, and the terms linear in the rotational invariants
correctly cancel out in (45.3). The quadratic terms then give the
following relations.


$\hat p_1 = p_1 + pp_2 + p^2p_3 + p^2p_4 + p^3p_5 + p^4p_6 + j_1(1 - \hat m^3/m),$
$\hat p_2 = b(p_2 + 2pp_3 + 2pp_4 + 3p^2p_5 + 4p^3p_6) - 4j_1(1 - s\hat m^2/m),$
$\hat p_3 = b^2(p_3 + pp_5 + 2p^2p_6) + 2j_1(1 - s^2\hat m/m),$
$\hat p_4 = b^2(p_4 + 2pp_5 + 4p^2p_6) + 4j_1(1 - s^2\hat m/m),$ (45.9)
$\hat p_5 = b^3(p_5 + 4pp_6) - 4j_1(1 - s^3/m),$
$\hat p_6 = b^4p_6 + j_1(1 - s^4/m\hat m),$
where $j_1 = \tfrac{1}{8}(\hat m - m)(s - \hat m)^{-4},$ (45.10)


and $\hat m = (m - sp)/b$, in view of (45.5).

The equations (45.9) already suffice to show that one cannot have
more than one distinct pair of perfect planes. To see this most
simply, take $s = 0$, since the actual value of $s$ is clearly irrelevant.
Then the equations $\hat p_\alpha = p_\alpha = 0$ $(\alpha \neq 6)$ are obviously mutually
inconsistent.

The $\hat p_\alpha$ now all involve $p_6$, whereas in the context of stop shifts
the $\hat p_\alpha$ $(\alpha < 6)$ did not. Quite generally, in order $2n-1$,


$\hat k_\alpha \quad [\alpha \neq \tfrac{1}{2}(n+1)(n+2)]$


requires that we know the value of $k_{\frac{1}{2}(n+1)(n+2)}$ when the object is shifted, but
not when the stop is shifted. We also see that primary spherical
aberration can vanish for at most four positions of the object. If
it vanishes for some given position, then it will vanish for neighbouring
positions obtained from the first member of (45.9) by
taking $p$ to be infinitesimal. This gives


$p_2 = \tfrac{1}{8}(s-m)^{-3}(1 - m^2),$ (45.11)


that is to say, $\hat p_1$ can vanish together with $p_1$ only if the coefficient
of primary coma has a definite value depending upon the values of
the magnifications $s$ and $m$ alone.

(45.11) constitutes Hockin’s condition (also known as Herschel’s
condition) in the primary domain. It is easily extended to all orders.
If we reject all terms not linear in $p$, and at the same time set
$\hat\eta = \hat\zeta = 0$ (since at present we are only interested in those terms of
$T$ which depend on $\xi$ alone) we have to put


$\xi \to \hat\xi, \quad \eta \to p\,\hat\xi, \quad \zeta \to 0, \quad \bar\xi \to (s-m)^{-2}\hat\xi, \quad \bar\zeta \to m^2(s-m)^{-2}\hat\xi$
on the right of (45.3). Writing


$t(\xi, \eta, \zeta) = S(\xi) + \eta\,C(\xi) + \ldots$ (45.12)


in analogy with (26.1), we find
$\xi\,C(\xi) = m^{-2}[(s-m)^2 - m^2\xi]^{1/2} - [(s-m)^2 - \xi]^{1/2}$
$\qquad + (s-m)(1 - 1/m^2).$ (45.13)
If we recall that $C(\xi) = p_2\xi + s_2\xi^2 + \ldots$ we see at once that (45.11)
is correctly contained in (45.13). Evidently spherical aberration and
circular coma can vanish together for neighbouring positions of the object plane
only if $m^2 = 1$.

From (45.9) one deduces very easily that $2\hat p_3 - \hat p_4 = b^2(2p_3 - p_4)$, so
that the quantity $(s-m)^2(2p_3 - p_4)$ has a fixed value for all positions
of the object. Recalling a previous result, it thus has the same value
for all stop and object positions; a state of affairs which we shall
shortly consider in greater generality.
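
The fixed value of $(s-m)^2(2p_3 - p_4)$ under an object shift can be confirmed numerically. The sketch below is an illustration under stated assumptions: it takes the primary object-shift relations (45.9) with $p = -(\hat m - m)/(s - \hat m)$, $b = 1 - p$ and $j_1 = \tfrac{1}{8}(\hat m - m)(s - \hat m)^{-4}$, and the function names are hypothetical. All inputs are expected to be `Fraction` values so that the comparison is exact.

```python
from fractions import Fraction as F

def object_shift_primary(coeffs, s, m, m_hat):
    """Hatted primary coefficients after changing the object magnification m -> m_hat."""
    p1, p2, p3, p4, p5, p6 = coeffs
    p = -(m_hat - m) / (s - m_hat)                 # assumed reading of (45.5)
    b = 1 - p
    j1 = F(1, 8) * (m_hat - m) / (s - m_hat) ** 4  # assumed reading of (45.10)
    return [
        p1 + p*p2 + p**2*p3 + p**2*p4 + p**3*p5 + p**4*p6 + j1*(1 - m_hat**3/m),
        b*(p2 + 2*p*p3 + 2*p*p4 + 3*p**2*p5 + 4*p**3*p6) - 4*j1*(1 - s*m_hat**2/m),
        b**2*(p3 + p*p5 + 2*p**2*p6) + 2*j1*(1 - s**2*m_hat/m),
        b**2*(p4 + 2*p*p5 + 4*p**2*p6) + 4*j1*(1 - s**2*m_hat/m),
        b**3*(p5 + 4*p*p6) - 4*j1*(1 - s**3/m),
        b**4*p6 + j1*(1 - s**4/(m*m_hat)),
    ]

def third_order_invariant(coeffs, s, m):
    """(s - m)^2 (2 p3 - p4) for the given coefficients and magnifications."""
    return (s - m) ** 2 * (2 * coeffs[2] - coeffs[3])

if __name__ == "__main__":
    s, m, m_hat = F(2), F(-1, 2), F(-1, 3)
    old = [F(k, 7) for k in (3, -1, 4, 1, -5, 9)]
    new = object_shift_primary(old, s, m, m_hat)
    print(third_order_invariant(old, s, m), third_order_invariant(new, s, m_hat))
```

The sample coefficients are arbitrary; they do not describe any particular optical system.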

The secondary equations corresponding to (45.9) are readily 
written down. We do not do so here since they will be contained in 
the more general results of the next section. 


46. Joint shifts of stop and object 


It is of considerable interest to examine the consequences of chang- 
ing the positions of the object and of the stop simultaneously by 
arbitrary amounts. Our previous results will then be merely special 
cases of those about to be derived; but they are valuable for general 
orientation if nothing else. 

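The constants in (43.1) now have to have values such that


$\beta - s\beta' \to \beta - \hat s\beta'$ and $\beta - m\beta' \to \beta - \hat m\beta'.$


This means taking


$a = 1, \quad b = (m\hat s - s\hat m)/(s-m), \quad c = 0, \quad d = (\hat s - \hat m)/(s-m).$ (46.1)

The definitions (44.4) and (45.5) of $q$ and $p$ may appropriately be
retained here, and then


$a = 1, \quad b = \dfrac{sp + mq - (s+m)pq}{(1-p)(1-q)}, \quad c = 0, \quad d = \dfrac{1-pq}{(1-p)(1-q)}.$


From these we infer in turn that
$\bar a = -1/(1-q), \quad \bar b = -q/(1-q), \quad \bar c = p/(1-p), \quad \bar d = 1/(1-p).$ (46.2)


The definitions of the constants $b$ and $c$ may be somewhat generalized
to


$b = \dfrac{1-p}{1-pq}, \quad c = \dfrac{1-q}{1-pq},$ (46.3)
and then
$\xi = c^2\hat\xi + 2bcq\,\hat\eta + b^2q^2\hat\zeta,$
$\eta = c^2p\,\hat\xi + bc(1+pq)\,\hat\eta + b^2q\,\hat\zeta,$ (46.4)
$\zeta = c^2p^2\hat\xi + 2bcp\,\hat\eta + b^2\hat\zeta.$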


Even the appearance of the constants $b$ and $c$ could be prevented
here by ‘rescaling’ $\rho$ and $h'$ each time $s$ or $m$ or both are changed;
that is to say, by understanding $\rho$ to stand for $c\rho$ and $h'$ for $bh'$ (cf.
the discussion following upon equation (41.2)). However, one is
then apt to forget that $\rho$, for instance, is not the actual radial coordinate
in whatever exit pupil one has in hand, but only proportional
to it. We therefore desist from employing this device in the
present context.

The identity (45.3) remains as it stands, but we now have to use
(46.4) in place of (45.7), whilst in (45.8) $s$ has to be replaced by $\hat s$.
The equations (45.9) then have the following generalized counterparts:


$\hat p_1 = c^4(p_1 + pp_2 + p^2p_3 + p^2p_4 + p^3p_5 + p^4p_6) + j_1(1 - \hat m^3/m),$


$\hat p_2 = bc^3[4qp_1 + (1+3pq)p_2 + 2p(1+pq)p_3 + 2p(1+pq)p_4$
$\qquad + p^2(3+pq)p_5 + 4p^3p_6] - 4j_1(1 - \hat s\hat m^2/m),$


$\hat p_3 = b^2c^2[2q^2p_1 + q(1+pq)p_2 + (1+p^2q^2)p_3 + 2pqp_4 + p(1+pq)p_5$
$\qquad + 2p^2p_6] + 2j_1(1 - \hat s^2\hat m/m),$
$\hat p_4 = b^2c^2[4q^2p_1 + 2q(1+pq)p_2 + 4pqp_3 + (1+pq)^2p_4$
$\qquad + 2p(1+pq)p_5 + 4p^2p_6] + 4j_1(1 - \hat s^2\hat m/m),$


$\hat p_5 = b^3c[4q^3p_1 + q^2(3+pq)p_2 + 2q(1+pq)p_3 + 2q(1+pq)p_4$
$\qquad + (1+3pq)p_5 + 4pp_6] - 4j_1(1 - \hat s^3/m),$


$\hat p_6 = b^4(q^4p_1 + q^3p_2 + q^2p_3 + q^2p_4 + qp_5 + p_6) + j_1(1 - \hat s^4/m\hat m),$ (46.5)
where now $j_1 = \tfrac{1}{8}(\hat m - m)(\hat s - \hat m)^{-4}.$ (46.6)


The combination $2\hat p_3 - \hat p_4$ is again proportional to $2p_3 - p_4$, the
factor of proportionality being


$[(1-p)(1-q)(1-pq)^{-1}]^2 = (s-m)^2/(\hat s - \hat m)^2.$
It follows that $(s-m)^2(2p_3 - p_4) = \text{const.},$ (46.7)


i.e. the expression on the left has a value entirely independent of $s$
and $m$; as was shown previously.
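
The factor of proportionality leading to (46.7) depends only on how $p$, $q$, $b$ and $c$ are built from the old and new magnifications, and this is easily checked with exact arithmetic. The sketch below is an illustration under assumptions: it takes $q = (\hat s - s)/(\hat s - m)$, $p = -(\hat m - m)/(s - \hat m)$ and the generalized $b$, $c$ of (46.3); the function name `shift_parameters` is hypothetical.

```python
from fractions import Fraction as F

def shift_parameters(s, m, s_hat, m_hat):
    """Return (p, q, b, c) for a joint shift of stop (s -> s_hat) and object (m -> m_hat)."""
    q = (s_hat - s) / (s_hat - m)
    p = -(m_hat - m) / (s - m_hat)
    b = (1 - p) / (1 - p * q)
    c = (1 - q) / (1 - p * q)
    return p, q, b, c

if __name__ == "__main__":
    s, m, s_hat, m_hat = F(3), F(1), F(5), F(2)
    p, q, b, c = shift_parameters(s, m, s_hat, m_hat)
    # Factor relating 2 p3 - p4 before and after the shift, and its closed form.
    print((b * c * (1 - p * q)) ** 2, ((s - m) / (s_hat - m_hat)) ** 2)
```

With these definitions $\beta - s\beta'$ is $c$ times $\beta - \hat s\beta'$ plus $bq$ times $\beta - \hat m\beta'$, which is the content of the first row of (46.4); the test values above were chosen only for convenience.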

Though rather lengthy, the secondary relations will now be
exhibited in full. With


$j_2 = \tfrac{1}{16}(\hat m - m)(\hat s - \hat m)^{-6},$ (46.8)
they are


$\hat s_1 = c^6(s_1 + ps_2 + p^2s_3 + p^2s_4 + p^3s_5 + p^4s_6 + p^3s_7 + p^4s_8 + p^5s_9 + p^6s_{10})$
$\qquad + j_2(1 - \hat m^5/m),$


$\hat s_2 = bc^5[6qs_1 + (1+5pq)s_2 + 2p(1+2pq)s_3 + 2p(1+2pq)s_4$
$\qquad + 3p^2(1+pq)s_5 + 2p^3(2+pq)s_6 + 3p^2(1+pq)s_7$
$\qquad + 2p^3(2+pq)s_8 + p^4(5+pq)s_9 + 6p^5s_{10}] - 6j_2(1 - \hat s\hat m^4/m),$


$\hat s_3 = b^2c^4[3q^2s_1 + q(1+2pq)s_2 + (1+2p^2q^2)s_3 + pq(2+pq)s_4$
$\qquad + p(1+pq+p^2q^2)s_5 + p^2(2+p^2q^2)s_6 + 3p^2qs_7 + p^2(1+2pq)s_8$
$\qquad + p^3(2+pq)s_9 + 3p^4s_{10}] + 3j_2(1 - \hat s^2\hat m^3/m),$


$\hat s_4 = b^2c^4[12q^2s_1 + 4q(1+2pq)s_2 + 4pq(2+pq)s_3 + (1+pq)(1+5pq)s_4$
$\qquad + 2p(1+4pq+p^2q^2)s_5 + 4p^2(1+2pq)s_6 + 3p(1+pq)^2s_7$
$\qquad + p^2(1+pq)(5+pq)s_8 + 4p^3(2+pq)s_9 + 12p^4s_{10}]$
$\qquad + 12j_2(1 - \hat s^2\hat m^3/m),$
$\hat s_5 = b^3c^3[12q^3s_1 + 6q^2(1+pq)s_2 + 4q(1+pq+p^2q^2)s_3 + 2q(1+4pq+p^2q^2)s_4$
$\qquad + (1+pq)(1+4pq+p^2q^2)s_5 + 4p(1+pq+p^2q^2)s_6$
$\qquad + 6pq(1+pq)s_7 + 2p(1+4pq+p^2q^2)s_8 + 6p^2(1+pq)s_9$
$\qquad + 12p^3s_{10}] - 12j_2(1 - \hat s^3\hat m^2/m),$
$\hat s_6 = b^4c^2[3q^4s_1 + q^3(2+pq)s_2 + q^2(2+p^2q^2)s_3 + q^2(1+2pq)s_4$
$\qquad + q(1+pq+p^2q^2)s_5 + (1+2p^2q^2)s_6 + 3pq^2s_7 + pq(2+pq)s_8$
$\qquad + p(1+2pq)s_9 + 3p^2s_{10}] + 3j_2(1 - \hat s^4\hat m/m),$
$\hat s_7 = b^3c^3[8q^3s_1 + 4q^2(1+pq)s_2 + 8pq^2s_3 + 2q(1+pq)^2s_4$
$\qquad + 4pq(1+pq)s_5 + 8p^2qs_6 + (1+pq)^3s_7 + 2p(1+pq)^2s_8$
$\qquad + 4p^2(1+pq)s_9 + 8p^3s_{10}] - 8j_2(1 - \hat s^3\hat m^2/m),$
$\hat s_8 = b^4c^2[12q^4s_1 + 4q^3(2+pq)s_2 + 4q^2(1+2pq)s_3 + q^2(1+pq)(5+pq)s_4$
$\qquad + 2q(1+4pq+p^2q^2)s_5 + 4pq(2+pq)s_6$
$\qquad + 3q(1+pq)^2s_7 + (1+pq)(1+5pq)s_8 + 4p(1+2pq)s_9$
$\qquad + 12p^2s_{10}] + 12j_2(1 - \hat s^4\hat m/m),$
$\hat s_9 = b^5c[6q^5s_1 + q^4(5+pq)s_2 + 2q^3(2+pq)s_3 + 2q^3(2+pq)s_4$
$\qquad + 3q^2(1+pq)s_5 + 2q(1+2pq)s_6 + 3q^2(1+pq)s_7$
$\qquad + 2q(1+2pq)s_8 + (1+5pq)s_9 + 6ps_{10}] - 6j_2(1 - \hat s^5/m),$
$\hat s_{10} = b^6(q^6s_1 + q^5s_2 + q^4s_3 + q^4s_4 + q^3s_5 + q^2s_6 + q^3s_7 + q^2s_8$
$\qquad + qs_9 + s_{10}) + j_2(1 - \hat s^6/m\hat m).$ (46.9)
The ratio of the last terms in the third and fourth of these equations
is a number, independent of $s$ and $m$, and the same is true of the fifth
and seventh, and the sixth and eighth equations. It is therefore
natural to consider the combinations $4\hat s_3 - \hat s_4$, $2\hat s_5 - 3\hat s_7$, $4\hat s_6 - \hat s_8$.
One finds that


$4\hat s_3 - \hat s_4 = (1-q)^4(1-p)^2(1-pq)^{-4}\,[(4s_3 - s_4)$
$\qquad + p(2s_5 - 3s_7) + p^2(4s_6 - s_8)],$
$2\hat s_5 - 3\hat s_7 = (1-q)^3(1-p)^3(1-pq)^{-4}\,[2q(4s_3 - s_4)$
$\qquad + (1+pq)(2s_5 - 3s_7) + 2p(4s_6 - s_8)],$
$4\hat s_6 - \hat s_8 = (1-q)^2(1-p)^4(1-pq)^{-4}\,[q^2(4s_3 - s_4)$
$\qquad + q(2s_5 - 3s_7) + (4s_6 - s_8)].$ (46.10)


This result means that under any change of $s$ and $m$, $4s_3 - s_4$, $2s_5 - 3s_7$,
$4s_6 - s_8$ transform amongst themselves, independently of all other
coefficients; so that if all three of these combinations vanish for
some particular value of $s$ and of $m$ they will do so for all values of $s$
and $m$. In particular, $(s-m)^4(4s_3 - s_4)$ is independent of $s$ (cf.
(44.11)), whilst $(s-m)^4(4s_6 - s_8)$ is independent of $m$.

Conclusions of the kind we have just drawn are of sufficient 
interest to make us seek some simple general method which will 
allow us to arrive at similar results governing the coefficients of 
any order; and this we now proceed to develop. 


D. Invariant and semi-invariant aberrations 
47. The focal angle characteristic 


The angle characteristic $T$ which entered into the considerations of
the three preceding sections was taken with respect to the base-points
$O_0'$ and $O_0$. Whereas from a formal viewpoint the dependence of
$T$ on $s$ was inessential, in the sense that $s$ enters into $T$ only through
the particular rotational invariants which happen to be chosen, its
dependence on $m$ is essential, for a change of $m$ entails a change of
$T$, no matter how one chooses the rotational invariants. This means
that the $t^{(n)}_{\mu\nu}$ are necessarily dependent upon $m$, that is to say, they
are not ‘absolute’ constants of the system. For our present purposes
it is therefore convenient to write $T$ as the sum of two parts, of
which the first is the angle characteristic referred to base-points
which are, as it were, inherent in the structure of $K$. The principal
foci $F$ and $F'$ suggest themselves, and we adopt them. The angle
characteristic $\bar T$ so defined is called the focal angle characteristic.
Paraxially $\bar T$ is remarkably simple:


$\bar T = -\bar\eta,$ (47.1)
leaving aside a trivial additive constant. The value unity has been
attached to $f$ as usual. The coefficients $\bar t^{(n)}_{\mu\nu}$ are now absolute constants
of $K$, i.e. they involve neither $s$ nor $m$. If $O_0'$ and $O_0$ lie at
distances $x'$ and $x$ to the right of $F'$ and $F$ respectively, $x'$ is given by
(14.27) with $s = 0$, and likewise $-x$ is given by (14.28) with $1/s = 0$.
It follows that $T = \bar T + x'\alpha' - x\alpha.$ (47.2)


Here the various functions may still be taken to depend upon variables
of our choice, and we adopt (47.2) in the form


$T(\xi, \eta, \zeta) = \bar T(\bar\xi, \bar\eta, \bar\zeta) - m(1-\bar\xi)^{1/2} - m^{-1}(1-\bar\zeta)^{1/2}.$ (47.3)




We also write this occasionally as 

$T = \bar T + D.$ (47.4)
Here, then, the right-hand member does not involve $s$ at all, whilst
$m$ enters into it only through $D$. In principle we could now simply
express $\bar\xi, \bar\eta, \bar\zeta$ in terms of $\xi, \eta, \zeta$ on the right of (47.4), and so, in
effect, recover our previous equations such as (44.8) or (45.9), for
example, though differently expressed. Our present objective,
however, is to attain, as quickly and systematically as we can, results
of the kind represented by (44.11) or (46.7); and we shall see
that this can be done without having first to write down the equations
giving the explicit dependence of the aberration coefficients
upon $s$ and $m$.


48. The idea of invariant and semi-invariant aberrations 


Before attempting to derive specific results, let us become quite
clear about what we are trying to do. To this end, consider the
dependence of the primary aberration coefficients upon $s$, as given
by (35.13). The latter represents a linear transformation of these
coefficients, from the $p_\alpha$ to the $\hat p_\alpha$; and one can write down six combinations
of coefficients (i.e. as many as there are coefficients),
the values of which are independent of $s$. A simple example is the
expression $(s-m)^3(4p_1 + p_2)$; for, since $c + q = 1$,


$4\hat p_1 + \hat p_2 = c^3(4p_1 + p_2).$ (48.1)
Bearing in mind that $c = (s-m)/(\hat s-m)$ it follows that
$(\hat s-m)^3(4\hat p_1 + \hat p_2) = (s-m)^3(4p_1 + p_2),$ (48.2)


or, in other words, $(s-m)^3(4p_1 + p_2)$ has the same value for every
position of the stop. One might therefore say that this expression
is invariant under stop-shifts, or ‘s-invariant’; and this has indeed
sometimes been done. However, there is not much point in such a
terminology here, because it reflects nothing but the content of
the first two members of (35.13). In any event, this kind of ‘s-invariance’
has no useful significance in the context of the displacement.
Thus, with regard to our example, the meridional partial
displacement is $(s-m)(4p_1\rho^3 + 3p_2\rho^2h')$; and (48.2) has evidently
no direct useful implications. The situation would be otherwise if
$p_1$ and $p_2$ multiplied the same powers of $\rho$ and $h'$, i.e. jointly governed
a particular type of aberration, as do $p_3$ and $p_4$. In that case the partial
displacement could be subdivided into two parts, one of which would
have the property that if it vanished for one value of $s$, then it would
do so for all values of $s$, independently of $\rho$ and $h'$. It is also characteristic
of the state of affairs just described that results of the kind
(48.2) are coordinate-dependent, in the sense that, upon a mere
change of scale of one coordinate, the individual coefficients will
have to be supplied with further numerical factors.

The upshot of the preceding remarks is that we should principally
concern ourselves only with combinations of coefficients
governing aberrations of a given order and type. Any such combination
will have the generic form

$w^{(n)}_\lambda = \sum a_{\mu\nu}\,t^{(n)}_{\mu\nu},$ (48.3)
where the $a_{\mu\nu}$ are purely numerical coefficients, and the sum
goes over values of $\mu$ and $\nu$ such that $\mu + \nu = \text{const.} = \lambda$, say.
If there exists an integer $r$ such that $(s-m)^r w^{(n)}_\lambda$ is independent
of $s$ and $m$, then we call this expression an (absolute) invariant, or
also an invariant aberration (the factor $\xi^{n-\mu}\eta^{\mu-\nu}\zeta^{\nu}$ then being left
understood); when it depends only upon $s$ or $m$, but not upon both
together, we call it a semi-invariant (aberration). More precisely,
when it is independent of $s$ only it is an s-invariant, and when it is
independent of $m$ only it is an m-invariant. Thus, recalling (46.7) and
(44.11), $(s-m)^2(2p_3 - p_4)$ is a third-order absolute invariant, and
$(s-m)^{2n-2}[2(n-1)k_3 - k_4]$ is an s-invariant of order $2n-1$, where
$n > 2$. Our next task is to enumerate, and as far as possible exhibit
explicitly, all the invariants and semi-invariants of the various orders.


49. Generators of invariants and semi-invariants 


In consequence of the linear relations (15.11) and (15.12) any
differentiation with respect to the unbarred variables can at once
be replaced by differentiation with respect to the barred variables
and vice versa. Let us put


$S = (s-m)^2\,\dfrac{\partial}{\partial\xi}, \qquad \bar S = \dfrac{\partial}{\partial\bar\xi} + m\dfrac{\partial}{\partial\bar\eta} + m^2\dfrac{\partial}{\partial\bar\zeta},$ (49.1)
$M = (s-m)^2\,\dfrac{\partial}{\partial\zeta}, \qquad \bar M = \dfrac{\partial}{\partial\bar\xi} + s\dfrac{\partial}{\partial\bar\eta} + s^2\dfrac{\partial}{\partial\bar\zeta},$ (49.2)


$A = (s-m)^2\left(4\dfrac{\partial^2}{\partial\xi\,\partial\zeta} - \dfrac{\partial^2}{\partial\eta^2}\right), \qquad \bar A = 4\dfrac{\partial^2}{\partial\bar\xi\,\partial\bar\zeta} - \dfrac{\partial^2}{\partial\bar\eta^2}.$ (49.3)
Then it is an elementary exercise to confirm that
$S = \bar S, \quad M = \bar M, \quad A = \bar A.$ (49.4)


For reasons which will become clear shortly, we call the differential
operators $S$, $M$ and $A$ generators of invariants (semi-invariants here
being included in this term). $\bar S$, $\bar M$, $\bar A$ are effectively the same generators,
though differently expressed, as we see from (49.4). We shall
say that $S$ and $\bar S$, $M$ and $\bar M$, $A$ and $\bar A$ are adjoint to each other; and
this terminology will be extended to products of powers of these
operators.
We now take (47.4) order by order, and write


$t^{(n)} = \bar t^{(n)} + d^{(n)},$ (49.5)


quantities with a superscript $(n)$ relating to order $2n-1$, as usual.
However, exceptionally we shall speak of them as being of degree $n$,
rather than $2n$, thus thinking in terms of the rotational invariants
only. This is appropriate in the present context, since differentiation
with respect to the ray-coordinates nowhere occurs.

Let us then consider first the action of some power of the generator
$A$ on $t^{(n)}$. Clearly $A^\alpha t^{(n)}$ is a polynomial of degree $n - 2\alpha$, so that
for no choice of $\alpha$ will $A^\alpha t^{(n)}$ be a constant unless $n$ is even. When
$n$ is even ($= 2N$, say ($N = 1, 2, \ldots$)), take $\alpha = N$. Then, on the left
of (49.5), $A^N t^{(2N)}$ is $(s-m)^{2N}$ times a linear combination of aberration
coefficients of order $4N-1$ which govern a particular type of aberration.
On the right of (49.5) we apply the adjoint operator $\bar A^N$.
$\bar A$ by itself completely annihilates $D$ in any event, whereas $\bar A^N\bar t^{(2N)}$
reduces to a constant which, as we know, does not depend upon $s$ or
$m$. Since all relations (49.1)-(49.5) are identities by nature, it
follows that the constant

$\mathfrak{a}_{4N-1} = A^N t^{(2N)}$ (49.6)


is an absolute invariant of order $4N-1$ ($N = 1, 2, \ldots$).

Next, we come to the semi-invariants, and we take the s-invariants
first. Inspection shows that $A^\alpha S^{n-2\alpha} t^{(n)}$ is a constant, where $\alpha$
is a positive integer or zero, such that $n - 2\alpha \ge 0$. Moreover, apart
from a common factor $(s-m)^{2n-2\alpha}$, this constant is a linear combination
(with purely numerical coefficients) of aberration coefficients
governing a particular type. On the other hand, since neither $\bar A$ nor
$\bar S$ nor $d^{(n)}$ involves $s$, the application of the adjoint operator $\bar A^\alpha\bar S^{n-2\alpha}$
to the right-hand member of (49.5) yields a constant independent
of $s$. Thus the constants

$\mathfrak{s}_{2n-1,\alpha} = A^\alpha S^{n-2\alpha} t^{(n)}$ (49.7)

are s-invariants, where $\alpha = 0, 1, \ldots, \tfrac{1}{2}n$ when $n$ is even or $\tfrac{1}{2}(n-1)$
when $n$ is odd.

It remains to consider the m-invariants. These are evidently
given by an expression like that on the right of (49.7), with $S$
replaced by $M$. However, in this case one must exclude the value 0
of $\alpha$, since otherwise on applying $\bar M^n$ to $\bar t^{(n)} + d^{(n)}$, the term $d^{(n)}$ will
fail to be annihilated, and the constant which has been generated
will then depend upon $m$. Accordingly, the constants


$\mathfrak{m}_{2n-1,\alpha} = A^\alpha M^{n-2\alpha} t^{(n)}$ (49.8)


are m-invariants, where $\alpha = 1, 2, \ldots, \tfrac{1}{2}n$ when $n$ is even or $\tfrac{1}{2}(n-1)$
when $n$ is odd.

The number of invariant aberrations of order $2n-1$ is easily
counted. In doing so we shall adopt the convention to exclude the
value 0 of $\alpha$ also in the case of the s-invariants, on the grounds that
this relates to the somewhat trivial s-invariance of spherical aberration.
We thus get Table 5.2 for the number of invariants of orders
$4N \pm 1$, where N = 1, 2, 3, ....


TABLE 5.2

Order    Absolute invariants    s-invariants    m-invariants    Total number
4N-1     1                      N-1             N-1             2N-1
4N+1     0                      N               N               2N

There are no invariants or semi-invariants beyond those just
enumerated. Bearing in mind that the phrase 'aberration coefficients
governing a certain type' is to be understood as relating to the
pseudo-displacement, any generator must be homogeneous, i.e. a
linear combination, with fixed coefficients, of operators $S^{n-\mu} J^{\mu-\nu} M^{\nu}$
($\mu + \nu$ = const. = $\lambda$, say), where

    $J = (s-m)^2\,(\partial/\partial\eta)$.   (49.9)

This ensures that in $t^{(n)}$ all terms not annihilated by the generators
have $\mu + \nu = \lambda$ (cf. Section 21). If this generator be now expressed
in terms of the adjoints $\bar S$, $\bar M$ and $\bar J$, where

    $\bar J = -2(\partial/\partial\xi) - (s+m)(\partial/\partial\eta) - 2sm(\partial/\partial\zeta)$,

either s or m is required to occur in the resulting expression at most
through a common factor which is a power of $(s-m)$. It is not difficult
to convince oneself that this leaves just the generators considered
above as the only possibilities.


50. On the absolute invariants 


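Using the simple machinery of Section 49 we can now write down
the absolute invariants $\sigma_{4N-1}$ explicitly in terms of the $t^{(2N)}_{\mu\nu}$. From
(49.6)

    $\sigma_{4N-1} = \Lambda^N t^{(2N)} = (s-m)^{-2N}(4SM - J^2)^N t^{(2N)}$
    $\qquad = (s-m)^{-2N} \sum_{\lambda=0}^{N} (-1)^\lambda\, 2^{2(N-\lambda)} \binom{N}{\lambda} S^{N-\lambda} J^{2\lambda} M^{N-\lambda} t^{(2N)}$.   (50.1)

For any particular value of $\lambda$ the operators annihilate all terms of
$t^{(2N)}$ other than that which has $\mu = N+\lambda$ and $\nu = N-\lambda$. Carrying
out the differentiations one then gets the desired result

    $\sigma_{4N-1} = N!\,(s-m)^{2N} \sum_{\lambda=0}^{N} (-1)^\lambda\, 2^{2(N-\lambda)}\,(N-\lambda)!\,(2\lambda)!\,(\lambda!)^{-1}\, t^{(2N)}_{N+\lambda,\,N-\lambda}$.   (50.2)

In particular one has, for N = 1, the well-known invariant

    $\sigma_3 = 2(s-m)^2\,(2t^{(2)}_{11} - t^{(2)}_{20}) = 2(s-m)^2\,(2p_3 - p_4)$.   (50.3)

Similarly, when N = 2,

    $\sigma_7 = 8(s-m)^4\,(8t^{(4)}_{22} - 2t^{(4)}_{31} + 3t^{(4)}_{40}) = 8(s-m)^4\,(8t_6 - 2t_8 + 3t_{11})$.   (50.4)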
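
As a check on the arithmetic, the numerical factors in (50.2) can be tabulated directly. The sketch below assumes that (50.2) reads $\sigma_{4N-1} = N!\,(s-m)^{2N}\sum_\lambda (-1)^\lambda 2^{2(N-\lambda)}(N-\lambda)!\,(2\lambda)!\,(\lambda!)^{-1}\,t^{(2N)}_{N+\lambda,N-\lambda}$; the function name sigma_coefficients is hypothetical, and the common power of $(s-m)$ is left out.

```python
from math import factorial


def sigma_coefficients(N):
    """Numerical factors of t^(2N)_{N+lam, N-lam}, lam = 0..N, in the assumed form of (50.2).

    The common factor (s - m)^(2N) is omitted.
    """
    return [
        factorial(N) * (-1) ** lam * 2 ** (2 * (N - lam))
        * factorial(N - lam) * factorial(2 * lam) // factorial(lam)
        for lam in range(N + 1)
    ]


print(sigma_coefficients(1))  # factors of t_11, t_20 for N = 1
print(sigma_coefficients(2))  # factors of t_22, t_31, t_40 for N = 2
```

For N = 1 the list is 2 times (2, -1), and for N = 2 it is 8 times (8, -2, 3), in line with the combinations quoted in (50.3) and (50.4).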


With increasing N these invariants become more and more of
merely academic interest, if for no other reason than that $\sigma_{4N-1}$
involves N+1 distinct coefficients. Effectively the best one can do
is to write the partial (pseudo-) displacement as the sum of two
terms, of which the first is

; cos 8 

ciyna = [2™A9—m)NANI (V1)! gy ap PANNE 


(50.5) 
whilst the second part is still governed jointly by N coefficients, i.e.
all those other than #2). This decomposition of the astigmatism of 
order 4N—1 and degree 2N—1 is a generalization of that of pri- 
mary curvature of field into Petzval and astigmatic curvature: the 
former vanishing for all values of s and m if it vanishes for any one 
pair of values. 


51. On the semi-invariants 


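The semi-invariants, as explicit combinations of the $t^{(n)}_{\mu\nu}$, are obtained
quite easily by the procedure of the preceding section. Thus, as
regards the s-invariants,

    $\varsigma_{2n-1,\alpha} = \Lambda^\alpha S^{n-2\alpha} t^{(n)}$
    $\qquad = (s-m)^{-2\alpha} \sum_{\lambda=0}^{\alpha} (-1)^\lambda\, 2^{2(\alpha-\lambda)} \binom{\alpha}{\lambda} S^{n-\alpha-\lambda} J^{2\lambda} M^{\alpha-\lambda} t^{(n)}$.

In the $\lambda$th term on the right only the term of $t^{(n)}$ with $\mu = \alpha+\lambda$
and $\nu = \alpha-\lambda$ need be retained. Then, with $0 \leq 2\alpha \leq n$,

    $\varsigma_{2n-1,\alpha} = \alpha!\,(s-m)^{2n-2\alpha} \sum_{\lambda=0}^{\alpha} (-1)^\lambda\, 2^{2(\alpha-\lambda)}$
    $\qquad\qquad \times\,(n-\alpha-\lambda)!\,(2\lambda)!\,(\lambda!)^{-1}\, t^{(n)}_{\alpha+\lambda,\,\alpha-\lambda}$.   (51.1)

Taking $\alpha = 1$, this gives exactly the expression (44.11). When
$\alpha = 2$,

    $\varsigma_{2n-1,2} = 8(n-4)!\,(s-m)^{2n-4}$
    $\qquad\qquad \times\,[4(n-2)(n-3)k_6 - 2(n-3)k_8 + 3k_{11}]$,   (51.2)

in the notation of (44.9). This result will be found to be in agreement
with (44.10). The coefficients which occur in (51.1) govern
terms in the $(2n-1)$th-order displacement varying with $\rho$ and $h'$
according to $\rho^{2n-2\alpha-1}h'^{2\alpha}$, i.e. those relating to astigmatism of degree
$2n-2\alpha-1 \geq n-1$.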

The m-invariants, finally, are

    $\mu_{2n-1,\alpha} = \Lambda^\alpha M^{n-2\alpha} t^{(n)}$,

where now $0 < 2\alpha \leq n$. It turns out, not surprisingly, that $\mu_{2n-1,\alpha}$
is given by an expression exactly like (51.1) except that in the
coefficient $t^{(n)}_{\alpha+\lambda,\,\alpha-\lambda}$ one has to replace $\alpha$ by $n-\alpha$.

The coefficients which appear in $\mu_{2n-1,\alpha}$ are those which govern
astigmatism of degree $2\alpha-1 \leq n-1$. All the invariants and
semi-invariants, taken together, thus relate just to all the various
astigmatic types of aberrations. $\mu_{2n-1,1}$ alone is concerned with linear
astigmatism. Indeed, temporarily writing


tr) t™ 


-1n-1=% lan-2 = ¢, 
(s—m)O-NE = c¥, Man a,1/2(n—2)! = fs, 


we have, on the one hand, 

p= (s—m)*"(20-6), (51.3) 
whilst the linear astigmatic partial displacement is 
€;, = 2(s—m) (c+€)ph’?”-* cos 6, €, = 2(s—m) cph’2”-* sin 0. (51.4) 


Now absorb a factor (s—m) in p, and a factor (s—m)-1 in h’. Then 
(51.4) becomes 


6, = (4+ 3c*) ph’ cosG, 6, = (“+c*) ph’ sind; (51.5) 


and the discussion of Section 19, based upon (19.14), may now be
recalled. In a sense, therefore, $\mu$ is the $(2n-1)$th-order analogue of
the Petzval curvature; nevertheless a closer examination of the
absolute invariants $\sigma_{4N-1}$ shows them to be also to some extent
analogous to the Petzval curvature, though in an entirely different
sense, which derives from the way in which they depend on parameters
which define the detailed structure of the system.


52. Conditional invariants and semi-invariants 


In Section 48 we discussed in detail why relations such as (48.2) 
were to be rejected in the context of s-invariance. Indeed, this rela- 
tion essentially merely restates the s-invariance of p,, taken together 
with the form of the second member of (35.13). On the other hand, 
if p, = o for some value of s (and therefore for every value of s), 
then p. = c*p., so that then (s—m)* p, is an s-invariant. Similarly if 
p, and p, vanish simultaneously for some value of s then (s —m)* pg 
will be an s-invariant; and so on, step-by-step down the set of 
equations (35.13), (44.8),.... Such conclusions are somewhat 
trivial in character. Yet they, when taken in conjunction with 
(46.10), provide a clue towards certain less trivial results. 
Contemplate changes of s alone, so that p = 0 in (46.10). $(s-m)^4(4s_3 - s_4)$
is s-invariant, as we already know, so that, if $4s_3 - s_4$
vanishes, the second equation of (46.10) shows that $(s-m)^3(2s_5 - 3s_7)$
is then also s-invariant; and here we have now only
coefficients governing one type of aberration. We shall say that
$(s-m)^3(2s_5 - 3s_7)$ is a conditional s-invariant. It will be noted that the
coefficient of spherical aberration is not involved here, nor is that
of circular coma, so that to this extent one has a state of affairs
somewhat less trivial than that discussed above. Furthermore, we
have now before us a result relating to a comatic aberration, namely
elliptical coma. When, in particular, $2s_5 - 3s_7 = 0$ for one value of
s, then it will vanish for all values of s, i.e. in (20.9) k = 2 for every
value of s, and the comatic flare is generated by ellipses whose
eccentricity is independent of s.

Inspection of (46.10) shows that $(s-m)^3(2s_5 - 3s_7)$ is also a
conditional m-invariant, but the condition implicit in this statement
is a different one, namely that $4s_6 - s_8$ should vanish for some
value of m. If $4s_3 - s_4$ and $4s_6 - s_8$ vanish for some s and m, then
$(s-m)^3(2s_5 - 3s_7)$ will be independent of s and m, and in this sense
the quantity $(s-m)^3(2s_5 - 3s_7)$, which relates to one type of aberration
alone, is a conditional absolute invariant.

The nature of these results, especially with regard to orders
exceeding the fifth, is such that it is scarcely appropriate to pursue
them here in any kind of detail. At any rate, it will suffice to remark
that, as in Sections 50 and 51, it is not necessary to go through explicit
equations such as (46.10). Instead one can use the method of
generators after the fashion of Section 49. Here, however, one
conveniently adds the following operators:

    $S^* = (s-m)^{-1}(2S+J)$,     $\bar S^* = -(\partial/\partial\eta + 2m\,\partial/\partial\zeta)$,
    $M^* = (s-m)^{-1}(2M+J)$,     $\bar M^* = \partial/\partial\eta + 2s\,\partial/\partial\zeta$,     (52.1)
    $K = (s-m)^{-2}(S+J+M)$,      $\bar K = \partial/\partial\zeta$.

Of course $\bar S^* = (s-m)^{-1}(2\bar S+\bar J)$, and so on; but we have written
the adjoints down explicitly so as to show that $\bar S^*$ is independent of
s, $\bar M^*$ independent of m, and $\bar K$ independent of both s and m. Also,
it will be noted that $S^* + M^* = 2(s-m)K$.


Now let $\Omega$ be an operator which is a product of the various generators,
appropriately chosen; that is to say, in such a way that (i) the
adjoint operator $\bar\Omega$ does not depend upon s and m simultaneously,
(ii) $\Omega$ has $\Lambda$ and at least one of $S^*$, $M^*$, K as a factor, and (iii) the
result of applying $\bar\Omega$ no longer depends on $\xi$, $\eta$, $\zeta$, nor does it vanish
identically. Under these circumstances the application of $\bar\Omega$ to the
right-hand member of (49.5) yields a constant independent of s or m
or both, as the case may be. It follows that $\Omega t^{(n)}$ is a linear combination
of aberration coefficients which has the same properties; a combination,
moreover, into which the coefficient of spherical aberration, at any rate,
does not enter. The independence of s, or m, or both entails that
the coefficients which enter into $\Omega t^{(n)}$ must transform linearly
amongst themselves under changes, respectively, of s, or m, or both.
The condition that certain of the coefficients in question vanish
leaves the rest as conditional invariants or semi-invariants.

These remarks may be illustrated by some simple examples. In
the primary domain $\Omega$ cannot contain any operators in addition to
$\Lambda$, and so there are no primary conditional invariants of any kind.
In the secondary domain we only have the possibilities $\Lambda S^*$, $\Lambda M^*$
and $\Lambda K$. One finds very easily that

    $\Lambda S^* t^{(3)} = (s-m)^3\,[4(4s_3 - s_4) + 2(2s_5 - 3s_7)]$,
    $\Lambda M^* t^{(3)} = (s-m)^3\,[2(2s_5 - 3s_7) + 4(4s_6 - s_8)]$,   (52.2)
    $\Lambda K t^{(3)} = 2(s-m)^2\,[(4s_3 - s_4) + (2s_5 - 3s_7) + (4s_6 - s_8)]$.

The three expressions on the right are independent of s, of m,
and of both, respectively. On the other hand $(s-m)^3(2s_5 - 3s_7)$ is
not independent of s, though $(s-m)^4(4s_3 - s_4)$ is. Thus the vanishing
of $4s_3 - s_4$ for any value of s entails the s-invariance of
$(s-m)^3(2s_5 - 3s_7)$. We see that we have recovered precisely the results
discussed earlier in this section.
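
The numerical combinations in (52.2) can be reproduced mechanically. The following Python sketch is an illustration only and rests on an assumption not spelled out in this section: that S, J and M act as one and the same power of $(s-m)$ times $\partial/\partial\xi$, $\partial/\partial\eta$ and $\partial/\partial\zeta$ respectively, so that, apart from powers of $(s-m)$, a product of generators is a polynomial in the three derivatives. Such a polynomial picks out the coefficient of $\xi^a\eta^b\zeta^c$ with weight $a!\,b!\,c!$. All names are hypothetical.

```python
from collections import defaultdict
from math import factorial


def multiply(p, q):
    """Product of two polynomials in (d/dxi, d/deta, d/dzeta), stored as {exponents: coeff}."""
    out = defaultdict(int)
    for (a1, b1, c1), x in p.items():
        for (a2, b2, c2), y in q.items():
            out[(a1 + a2, b1 + b2, c1 + c2)] += x * y
    return dict(out)


def weights(op):
    """Weight with which the operator picks out the coefficient of xi^a eta^b zeta^c."""
    return {e: c * factorial(e[0]) * factorial(e[1]) * factorial(e[2])
            for e, c in op.items() if c != 0}


# Generators with all powers of (s - m) stripped off (assumption stated above).
LAMBDA = {(1, 0, 1): 4, (0, 2, 0): -1}             # 4SM - J^2
S_STAR = {(1, 0, 0): 2, (0, 1, 0): 1}              # 2S + J
M_STAR = {(0, 0, 1): 2, (0, 1, 0): 1}              # 2M + J
K = {(1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}     # S + J + M

# Keys are exponents (a, b, c) of xi^a eta^b zeta^c; e.g. (2, 0, 1) labels s_3,
# (1, 2, 0) labels s_4, (1, 1, 1) labels s_5 and (0, 3, 0) labels s_7.
print(weights(multiply(LAMBDA, S_STAR)))
print(weights(multiply(LAMBDA, M_STAR)))
print(weights(multiply(LAMBDA, K)))
```

The first dictionary corresponds to $16s_3 - 4s_4 + 4s_5 - 6s_7 = 4(4s_3 - s_4) + 2(2s_5 - 3s_7)$, which is the bracket in the first line of (52.2); the other two lines follow in the same way.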

One final example, relating to tertiary coefficients, will suffice.
Inspection of $\Lambda SS^*t^{(4)}$ shows that $(s-m)^5(4t_5 - 3t_7)$ is a conditional
s-invariant, namely when $6t_3 - t_4 = 0$. $\Lambda S^{*2}t^{(4)}$ and $\Lambda SKt^{(4)}$ yield
two further conditional s-invariants. These are $(s-m)^4(8t_6 - t_8)$
and $(s-m)^4(t_8 - 3t_{11})$, where now $6t_3 - t_4$ and $4t_5 - 3t_7$ must both
vanish (for some value of s). Again, three conditions must be satisfied
for $(s-m)^3(4t_9 - 3t_{12})$ to be s-invariant; and this follows from
considering $\Lambda S^*Kt^{(4)}$. Finally, upon evaluating $\Lambda K^2t^{(4)}$ it emerges
that if, in addition, $4t_9 - 3t_{12} = 0$ then $6t_{10} - t_{13}$ is s-invariant. In
short, under changes of s the six groups $6t_3 - t_4$, $4t_5 - 3t_7$, $8t_6 - t_8$,
$t_8 - 3t_{11}$, $4t_9 - 3t_{12}$ and $6t_{10} - t_{13}$ transform linearly amongst
themselves; and this is in fact the case for simultaneous changes of s and m,
since $\bar K$ is independent of these. The six groups just enumerated,
multiplied by appropriate powers of $s-m$, are evidently also conditional
m-invariants. The absolute invariant $\sigma_7$ is implicit in these
results, since one has, identically, $\Lambda = 4SK - S^{*2} = 4MK - M^{*2}$.


Problems 


P.5 (i). A system is free from spherical aberration of all orders, but 
circular coma of all orders is present. The circular comatic ratios 
have a fixed value for all orders. Show that 

a os 

= eon 
where c is a constant. 


P.5 (ii). Obtain an analogue of the quantity A(p), intended to relate 
to that part of the aberration function which is responsible for 
distortion. 


P.5 (iii). Show that unless m = n 
1 
| im+1 R(t) dt = 0. 
0 


P.5 (iv). Using (44.10) express the sum 
    $4(n-2)(n-3)k_6 - 2(n-3)k_8 + 3k_{11}$

in terms of the &,,. 
P.5(v). The primary aberrations associated with a given pair of 
conjugate planes, magnification m, are given to be zero. Show that 
primary spherical aberration and coma can vanish together for 
magnification # only if m2 = 1. 
P.5 (vi). Using equation (4.5) show that the focal angle character- 
istic T' of a single spherical refracting surface is 

| T =K1(1+2Ky)4+a+e’, 
where x = 1—(ae’+ ££’+yy’), and f has been set equal to unity. 
P.5 (vii). Find the spherical point characteristic V‘ of a system 
which produces no aberrations other than distortion. 


CHAPTER 6 


SYMMETRIC SYSTEMS WITH 
ADDITIONAL SYMMETRIES 


A. Reversible systems 


53. Definition of the reversible system


A symmetric system K will be called reversible if there exists a
normal plane of symmetry $\mathscr{C}$; or, in other words, if K is such that
the part of K which lies to the right of $\mathscr{C}$ is the mirror image of that
part which lies to the left of it. For the purposes of this definition
neither the object and image planes, nor the pupil planes are to be 
taken as being ‘parts of K’. One might equivalently call a sym- 
metric system reversible if there exists a line through the axis of K 
and normal to it, such that K is invariant under rotations through 
180° about this line. This remark will appear less trivial if one notes 
that the alternative definitions are equivalent only on account of the 
symmetries which define the symmetric system. The situation is 
otherwise, for example, in the case of a system consisting of a single 
triangular cylinder; granted that the ‘axis’ is a line parallel to one 
side of the triangular cross-section. At any rate, it may be remarked 
that any symmetric system containing one reflecting surface is 
automatically reversible, granted that its refracting part is traversed 
twice by every ray. In this context the necessary imperfections of 
the imagery which will shortly be described are particularly interest- 
ing; and the reasons for using curved receiving surfaces are 
apparent. 

The reversible system is a generalization of the traditional 
holosymmetric system, the latter being reversible in our sense with 
the additional stipulation that the stop must be central and the 
conjugate object and image planes symmetrically disposed ; in other 
words, that s = 1 and m = —1. When these conditions are satisfied 
we shall call the system completely reversible. However, the absence 
of these restrictions makes the whole problem a good deal more 
interesting. Consequently, since general values of s and m are to be
contemplated it is natural to carry out the present investigation on
the basis of the angle characteristic. Except in the last part of Section
56 we therefore suppose K throughout not to be telescopic:
$f \neq \infty$; and we shall generally set f = 1, as usual.

We shall call $\mathscr{C}$ the central plane, and denote its axial point by C.
Although we are ultimately interested in the angle characteristic
$T(\xi, \eta, \zeta)$ referred to the base-points $O_0$ and $O_0'$, the present context
dictates a temporary choice of base-points which are symmetrically
disposed about C. Accordingly we choose $O_0'$ as posterior base-point,
whilst as anterior base-point we take the axial point $O^*$
which lies as far to the left of C as $O_0'$ lies to the right of it. The
corresponding angle characteristic, regarded as a function of
$\xi$, $\eta$, $\zeta$, will be denoted by $T^*$. Now let $\beta$, $\beta'$ be the values of the
ray-coordinates of some ray $\mathscr{R}$ through K. With $\mathscr{R}$ we can associate the
ray $\mathscr{R}^*$ which is the 'reflection' of $\mathscr{R}$ in $\mathscr{C}$. The values of the
ray-coordinates of $\mathscr{R}^*$ are obviously $-\beta'$, $-\beta$. The optical distances
between the feet of the normals drawn from the base-points on to
the rays $\mathscr{R}$ and $\mathscr{R}^*$ are clearly the same for both, since the 'original'
and the 'reflected' systems are indistinguishable; see Fig. 6.1.
(We have chosen spherical refracting surfaces merely for the sake
of illustration.) It follows that the basic identity

    $T^*(\xi, \eta, \zeta) = T^*(\zeta, \eta, \xi)$   (53.1)

obtains; and this must ultimately yield all the special properties
which reversibility entails. It should be added that (53.1) is a necessary
but not a sufficient condition for reversibility. We shall see
later, in Section 64, that concentric systems satisfy (53.1), but they
need not be reversible.

Fig. 6.1
The distance $d^* = O_0O^*$ is easily calculated. If, for the moment,
we set s = 1, E and E' will be symmetrically situated with respect to
C, and $O^*E = E'O_0' = d'$. Then $d^* = d - d'$, or

    $d^* = -(1-m^2)/m$,   (53.2)

from (14.27) and (14.28), with s = 1.

We shall develop the theory of reversible systems ab initio, 
that is to say, for general values of s and m. To a certain extent this 
entails some redundancy, for one could confine oneself in the first 
instance to the particularly simple case of complete reversibility 
(Section 59). Subsequently general values of s and m are then 
accounted for by using the results of Section 46, relating to joint 
changes of s and m. 


54. The basic identity ($m \neq 0$)

The stage has been reached for reverting to the usual base-points
$O_0$, $O_0'$. It is convenient for the time being to require that m be not
zero, reserving the special case m = 0 for separate treatment in
Section 60. Then

    $T = T^* + d^*\alpha$.   (54.1)


Incidentally, for once we need not contemplate the use of reduced 
distances, on account of the fact that, necessarily, N = N’; so 
that we can set N = N’ = 1, without loss of generality. 

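Now let $\bar\xi$, $\bar\eta$, $\bar\zeta$ go over into $\tilde\xi$, $\tilde\eta$, $\tilde\zeta$ when $\xi$ and $\zeta$ are interchanged.
Then (53.1) and (54.1) jointly imply the basic identity

    $T(\bar\xi, \bar\eta, \bar\zeta) - d^*(1-\zeta)^{1/2} = T(\tilde\xi, \tilde\eta, \tilde\zeta) - d^*(1-\xi)^{1/2}$.   (54.2)

By virtue of (53.2) the linear terms of this mutually cancel. Then,
once again absorbing all non-linear terms of g in the aberration
function, (54.2) becomes

    $t^{(n)}(\bar\xi, \bar\eta, \bar\zeta) - K_n\zeta^n = t^{(n)}(\tilde\xi, \tilde\eta, \tilde\zeta) - K_n\xi^n$,   (54.3)

where

    $K_n = (-1)^n \binom{1/2}{n} d^*$.   (54.4)

The expressions for $\tilde\xi$, $\tilde\eta$, $\tilde\zeta$ in terms of $\bar\xi$, $\bar\eta$, $\bar\zeta$ may be obtained directly
from the results of Section 43. Here a = 0, b = -1, c = -1, d = 0.
Then, for instance,

    $\tilde\xi = (s-m)^{-2}\,[(1-sm)^2\bar\xi - 2(1-s^2)(1-sm)\bar\eta + (1-s^2)^2\bar\zeta]$,   (54.5)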
with two similar equations for $\tilde\eta$ and $\tilde\zeta$. These, and (15.12), now
need to be inserted in (54.3); and then the required relations between
the aberration coefficients may be read off. However, it is
immediately obvious that such relations will generally involve the
coefficient $t^{(n)}_{nn}$. This, on the other hand, does not enter into the
displacement (regarded as a function of the variables
of Section 24). In other words this coefficient is to be eliminated, and
this gives rise to tedious and irrelevant manipulations. We shall,
indeed, see in the next section how they can be avoided.


55. Introduction of auxiliary variables 
To circumvent the need to remove the unwanted coefficients
$t^{(n)}_{nn}$ from the relations between the aberration coefficients which are
implied by the reversibility of K, we introduce the auxiliary variables
L, M, N, defined as

    $(1-m^2)^2 L = m^2\xi - 2m\eta + \zeta$,
    $(1-m^2)^2 M = m\xi - (1+m^2)\eta + m\zeta$,   (55.1)
    $(1-m^2)^2 N = \xi - 2m\eta + m^2\zeta$,

the inverse relations being

    $\xi = m^2L - 2mM + N$,   $\eta = mL - (1+m^2)M + mN$,
    $\zeta = L - 2mM + m^2N$.   (55.2)
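
That the two sets of relations are mutually inverse, and that interchanging $\xi$ and $\zeta$ amounts to interchanging L and N, can be checked with exact rational arithmetic. The sketch below assumes the reading $(1-m^2)^2L = m^2\xi - 2m\eta + \zeta$, $(1-m^2)^2M = m\xi - (1+m^2)\eta + m\zeta$, $(1-m^2)^2N = \xi - 2m\eta + m^2\zeta$ of (55.1); the function names are hypothetical.

```python
from fractions import Fraction


def to_lmn(xi, eta, zeta, m):
    """Auxiliary variables L, M, N from xi, eta, zeta (assumed form of (55.1); needs m^2 != 1)."""
    w2 = (1 - m * m) ** 2
    L = (m * m * xi - 2 * m * eta + zeta) / w2
    M = (m * xi - (1 + m * m) * eta + m * zeta) / w2
    N = (xi - 2 * m * eta + m * m * zeta) / w2
    return L, M, N


def from_lmn(L, M, N, m):
    """Inverse relations (55.2)."""
    xi = m * m * L - 2 * m * M + N
    eta = m * L - (1 + m * m) * M + m * N
    zeta = L - 2 * m * M + m * m * N
    return xi, eta, zeta


m = Fraction(2, 5)
xi, eta, zeta = Fraction(3, 7), Fraction(-1, 4), Fraction(5, 9)
L, M, N = to_lmn(xi, eta, zeta, m)
print(from_lmn(L, M, N, m) == (xi, eta, zeta))      # round trip
print(to_lmn(zeta, eta, xi, m) == (N, M, L))        # xi <-> zeta swaps L and N
```

Exact fractions are used so that both comparisons are free of rounding error.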
Here we have supposed that $m^2 \neq 1$, reserving the very special case
$m^2 = 1$ for separate treatment in Section 59. Then, temporarily
writing

    $u = 1 - sm$,   $v = s - m$,   $w = 1 - m^2$,   (55.3)

we find that

    $\bar\xi = u^2L + 2uvM + v^2N$,   $\bar\eta = w(uL + vM)$,   $\bar\zeta = w^2L$.   (55.4)

The left-hand member of (54.3) now becomes

    $t^{(n)}(u^2L + 2uvM + v^2N,\; w(uL + vM),\; w^2L) - K_n(L - 2mM + m^2N)^n$.   (55.5)

It will be seen that through the introduction of the auxiliary
variables L, M, N a situation of considerable simplicity has been
attained. Thus, the basic mutual interchange of $\xi$ and $\zeta$ merely
implies the interchange of L and N. This entails that $\tilde\xi$, $\tilde\eta$, $\tilde\zeta$ arise
from $\bar\xi$, $\bar\eta$, $\bar\zeta$ merely by interchanging L and N. Consequently the
required relations, i.e. those which do not involve $t^{(n)}_{nn}$, are just
those which express the equality of the factors of $L^{n-\mu}M^{\mu-\nu}N^{\nu}$ on
the one hand and of $L^{\nu}M^{\mu-\nu}N^{n-\mu}$ on the other in (55.5), where
$\lambda\ (= \mu + \nu) \neq 0$ or $n$.


56. The third-order relation ($m^2 \neq 1$)


As usual, in particular in Section 44, we write

    $t^{(2)} = p_1\bar\xi^2 + p_2\bar\xi\bar\eta + p_3\bar\xi\bar\zeta + p_4\bar\eta^2 + p_5\bar\eta\bar\zeta + p_6\bar\zeta^2$,   (56.1)

i.e. the initial factor $(s-m)^{-1}$ on the right of (19.24) is again omitted.
(The Seidel coefficients are then given by
$\sigma_1 = 4(s-m)p_1$, ..., $\sigma_5 = (s-m)p_5$.)
The expression (55.5) reads, for n = 2,

    $p_1(u^2L + 2uvM + v^2N)^2 + p_2w(u^2L + 2uvM + v^2N)(uL + vM)$
    $\quad + p_3w^2(u^2L + 2uvM + v^2N)L + p_4w^2(uL + vM)^2$
    $\quad + p_5w^3(uL + vM)L + p_6w^4L^2 + \tfrac18 d^*(L - 2mM + m^2N)^2$.   (56.2)

Evidently only one relation is obtained from this, namely that which
arises from the equality of the factors multiplying LM and MN
respectively; but let it be recalled that $m^2 \neq 1$. Thus

    $4uv(u^2 - v^2)p_1 + vw(3u^2 - v^2)p_2 + 2uvw^2(p_3 + p_4) + vw^3p_5 = \tfrac12 d^*m(1-m^2)$.   (56.3)

Using (53.2) and (55.3) this becomes

    $4(1-s^2)(1-sm)p_1 + [3(1-sm)^2 - (s-m)^2]\,p_2$
    $\quad + 2(1-m^2)(1-sm)(p_3 + p_4) + (1-m^2)^2p_5 = -\tfrac12(1-m^2)/(s-m)$.   (56.4)

Finally we go over to the Seidel coefficients, and then

    $(1-s^2)(1-sm)\sigma_1 + [3(1-sm)^2 - (s-m)^2]\,\sigma_2 + (1-m^2)(1-sm)(3\sigma_3 + \sigma_4)$
    $\quad + (1-m^2)^2\sigma_5 = -\tfrac12(1-m^2)$.   (56.5)
As a consequence of reversibility one therefore has one relation
between the five Seidel coefficients. It is remarkable for the fact
that, provided $m^2 \neq 1$, it is inhomogeneous. In other words, when
$m^2 \neq 1$ the primary aberrations of the symmetric reversible system
cannot be simultaneously reduced to zero. We shall see later that an
analogous conclusion holds for the aberrations of any order.
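
The passage from (56.2) to (56.4) can be verified by brute-force expansion. The Python sketch below is an illustration under stated assumptions: (56.2) is read as $p_1\bar\xi^2 + p_2\bar\xi\bar\eta + \ldots + p_6\bar\zeta^2 + \tfrac18 d^*\zeta^2$ with $\bar\xi = u^2L + 2uvM + v^2N$, $\bar\eta = w(uL + vM)$, $\bar\zeta = w^2L$, $\zeta = L - 2mM + m^2N$ and $d^* = -(1-m^2)/m$, and (56.4) is read with right-hand member $-\tfrac12(1-m^2)/(s-m)$. Polynomials in L, M, N are stored as dictionaries keyed by exponent triples; all function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def mul(p, q):
    """Product of two polynomials in L, M, N stored as {(i, j, k): coeff}."""
    out = defaultdict(Fraction)
    for (a, b, c), x in p.items():
        for (d, e, f), y in q.items():
            out[(a + d, b + e, c + f)] += x * y
    return out


def add(*polys):
    out = defaultdict(Fraction)
    for p in polys:
        for key, val in p.items():
            out[key] += val
    return out


def scale(c, p):
    return {key: c * val for key, val in p.items()}


def lin(l, m_, n):
    """Linear form l*L + m_*M + n*N."""
    return {(1, 0, 0): Fraction(l), (0, 1, 0): Fraction(m_), (0, 0, 1): Fraction(n)}


def primary_polynomial(p, s, m):
    """Expand the assumed form of (56.2) in powers of L, M, N; p = (p1, ..., p6)."""
    u, v, w = 1 - s * m, s - m, 1 - m * m
    d_star = -(1 - m * m) / m                       # (53.2)
    xi, eta, zeta = lin(u * u, 2 * u * v, v * v), lin(w * u, w * v, 0), lin(w * w, 0, 0)
    z0 = lin(1, -2 * m, m * m)                      # zeta of (55.2)
    pairs = [(xi, xi), (xi, eta), (xi, zeta), (eta, eta), (eta, zeta), (zeta, zeta)]
    terms = [scale(c, mul(a, b)) for c, (a, b) in zip(p, pairs)]
    terms.append(scale(d_star / 8, mul(z0, z0)))
    return add(*terms)


s, m = Fraction(1, 3), Fraction(1, 2)
u, v, w = 1 - s * m, s - m, 1 - m * m
p1, p2, p3, p4, p6 = map(Fraction, (2, -1, 3, 5, 7))   # arbitrary test values
# p5 is fixed by the assumed form of (56.4):
p5 = (-(1 - m * m) / (2 * (s - m)) - 4 * (1 - s * s) * u * p1
      - (3 * u * u - v * v) * p2 - 2 * w * u * (p3 + p4)) / (w * w)
poly = primary_polynomial((p1, p2, p3, p4, p5, p6), s, m)
print(poly[(1, 1, 0)] == poly[(0, 1, 1)])   # factor of LM against factor of MN
```

With $p_5$ chosen to satisfy (56.4) the factors of LM and MN agree, whatever the values of the remaining coefficients; changing $p_5$ destroys the equality.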
When the stop is central (i.e. lies in $\mathscr{C}$) s = 1, and (56.5) reduces to

    $2\sigma_2 + (1+m)(3\sigma_3 + \sigma_4) + (1+m)^2\sigma_5 = -\tfrac12(1+m)/(1-m)$.   (56.6)

Although we have supposed that $m \neq 0$ we may let $m \to 0$ here,
with the intention of confirming later (in Section 60) that the result
so obtained is correct. One thus gets

    $2\sigma_2 + 3\sigma_3 + \sigma_4 + \sigma_5 = -\tfrac12$.   (56.7)

This equation relates just to the conditions for which photographic
objectives have sometimes been designed in the past. It is of course
valid irrespectively of the particular structure which K may possess.
There is no indication that 'coma should be small'. Indeed, if
tangential curvature of field and distortion have been removed, one
is necessarily left with coma of amount

    $\sigma_2 = -\tfrac14$,   (56.8)

where of course a factor $f^{-2}$ has to be inserted on the right if $f \neq 1$.
Similarly, if the system has been designed so as to produce a sharp
image, then one will be left with distortion of amount

    $\sigma_5 = -\tfrac12$;   (56.9)

see also Section 62. At any rate one should bear in mind that by
adhering strictly to reversibility one severely limits oneself from the
outset with regard to the extent to which satisfactory imagery
may be attained.

Although $p_6$ has so far been left out of account it is worth remarking
that its value is determined by that of the other aberration
coefficients. This is fortunate, bearing in mind that $p_6$ enters into
the effective higher-order displacements. From the equality of the
factors multiplying $L^2$ and $N^2$ respectively in (56.2), we find that

    $(1-s^2)(u^2 + v^2)p_1 + u^3p_2 + u^2w(p_3 + p_4) + uw^2p_5 + w^3p_6 = \tfrac18(1-m^4)/m$.   (56.10)

It may be noted that for any given value of m a particular
value of s gives results of great simplicity; see also Section 58.
It is such that E' lies as far to the right of C as $O_0$ lies to the left of it.
This condition requires that $d' + d^* = 0$, i.e. s = 1/m, or

    $u = 0$.   (56.11)

In that case (56.5) reduces to a condition involving $\sigma_2$ and $\sigma_5$,
i.e. the comatic coefficients, only:

    $\sigma_2/m^2 - \sigma_5 = \tfrac12(1-m^2)^{-1}$.   (56.12)
Finally, we must again draw attention to the fact that K has
throughout been supposed to be non-telescopic. In the contrary
case, i.e. when $f = \infty$, we must abandon the angle characteristic.
Consider therefore V, taken with respect to base-points $O_0$ and B',
where B' shall lie as far to the right of C as $O_0$ lies to the left of it.
It is assumed that the point so defined does not happen to be conjugate
to $O_0$. Under these conditions the linear function on the right
of (14.37) must be invariant under the mutual interchange of $\xi$
and $\zeta$, which entails that

    $m^2 = 1$.   (56.13)

It looks at first sight as if we were confronted with the state of
affairs to be studied separately in Section 59. This is, however, not
so: in the non-telescopic situation the case $m^2 = 1$ is 'special',
because $O_0$ and $O_0'$ are then equidistant from C, whereas this need
not be so here; we have, in fact, just excluded this special case.
Since the ideal point characteristic is unaffected when $\xi$ and $\zeta$ are
interchanged, reversibility simply requires

    $v^{(n)}(\xi, \eta, \zeta) = v^{(n)}(\zeta, \eta, \xi)$.   (56.14)

In particular, in the primary domain, $\sigma_2 = \sigma_5$.

We are naturally inclined to wonder whether this result could
not have been arrived at from those already obtained by means of
some limiting process. It is instructive to see how this may, in fact,
be done. Let f therefore be assumed to be large but finite. Then
the present choice of B' implies that $d' = (1-m^2)f/m$. Accordingly,
let $\epsilon$ be a parameter which is ultimately to tend to zero, and set

m=1+6, s=i/m, f? =[(1+6)/e] da’; (56.15) 


in harmony with the relations between s, m, f and d'. Restoring
the factor $f^{-2}$ explicitly in the right-hand member of (56.12), the
latter equation becomes


0-05 =(0,—4d'~)e. (56.16) 


On allowing $\epsilon$ to tend to zero, one recovers the stated result
$\sigma_2 = \sigma_5$. We are thus in the fortunate, but hardly surprising, position
of being able to deal with telescopic systems by means of this
kind of limiting device.


57. The fifth-order relations 


The secondary expression analogous to (56.2) is obtained by substituting
from (55.2) and (55.4) in the polynomial

    $s_1\bar\xi^3 + s_2\bar\xi^2\bar\eta + \ldots + s_{10}\bar\zeta^3 + \tfrac{1}{16}d^*\zeta^3$.

The fifth-order relations are those which express the equality of the
factors multiplying (i) $L^2M$ and $MN^2$, (ii) $L^2N$ and $LN^2$, (iii) $LM^2$
and $M^2N$, respectively. They are

    (i)   $6uv(u^2 + v^2)(1-s^2)s_1 + v(5u^4 - v^4)s_2 + 4u^3vw(s_3 + s_4)$
          $\quad + 3u^2vw^2(s_5 + s_7) + 2uvw^3(s_6 + s_8) + vw^4s_9 = -\tfrac38(1-m^4)$,
    (ii)  $3u^2v^2(1-s^2)s_1 + uv^2(2u^2 - v^2)s_2 + v^2w(2u^2 - v^2)s_3$
          $\quad + u^2v^2ws_4 + uv^2w^2s_5 + v^2w^3s_6 = \tfrac{3}{16}m(1-m^2)$,   (57.1)
    (iii) $12u^2v^2(1-s^2)s_1 + 4uv^2(2u^2 - v^2)s_2 + 4u^2v^2ws_3$
          $\quad + v^2w(5u^2 - v^2)s_4 + uv^2w^2(2s_5 + 3s_7) + v^2w^3s_8 = \tfrac34m(1-m^2)$.

By subtraction one finds from the last two equations that

    $(1-s^2)(4s_3 - s_4) + (1-sm)(2s_5 - 3s_7) + (1-m^2)(4s_6 - s_8) = 0$;   (57.2)

but no further substantial simplification appears possible. It is
interesting, though not very surprising, that the (conditional) invariants
which were studied in Section 52 reappear here. At any
rate, at least one of the relations must be inhomogeneous, so that not
all fifth-order coefficients can be simultaneously reduced to zero.
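
The subtraction leading to (57.2) is easily confirmed numerically. The sketch below assumes that relations (ii) and (iii) of (57.1) read $3u^2v^2(1-s^2)s_1 + uv^2(2u^2-v^2)s_2 + v^2w(2u^2-v^2)s_3 + u^2v^2ws_4 + uv^2w^2s_5 + v^2w^3s_6 = \tfrac{3}{16}m(1-m^2)$ and $12u^2v^2(1-s^2)s_1 + 4uv^2(2u^2-v^2)s_2 + 4u^2v^2ws_3 + v^2w(5u^2-v^2)s_4 + uv^2w^2(2s_5+3s_7) + v^2w^3s_8 = \tfrac34m(1-m^2)$; it then checks that (iii) minus four times (ii) is $-v^2w^2$ times the left-hand member of (57.2), for arbitrary values of the coefficients. The function names are hypothetical.

```python
from fractions import Fraction


def rel_ii(c, s, m):
    """Left minus right of relation (ii) in the assumed form of (57.1); c[i] stands for s_i."""
    u, v, w = 1 - s * m, s - m, 1 - m * m
    lhs = (3 * u * u * v * v * (1 - s * s) * c[1] + u * v * v * (2 * u * u - v * v) * c[2]
           + v * v * w * (2 * u * u - v * v) * c[3] + u * u * v * v * w * c[4]
           + u * v * v * w * w * c[5] + v * v * w ** 3 * c[6])
    return lhs - Fraction(3, 16) * m * (1 - m * m)


def rel_iii(c, s, m):
    """Left minus right of relation (iii) in the assumed form of (57.1)."""
    u, v, w = 1 - s * m, s - m, 1 - m * m
    lhs = (12 * u * u * v * v * (1 - s * s) * c[1] + 4 * u * v * v * (2 * u * u - v * v) * c[2]
           + 4 * u * u * v * v * w * c[3] + v * v * w * (5 * u * u - v * v) * c[4]
           + u * v * v * w * w * (2 * c[5] + 3 * c[7]) + v * v * w ** 3 * c[8])
    return lhs - Fraction(3, 4) * m * (1 - m * m)


def rel_572(c, s, m):
    """Left-hand member of (57.2)."""
    return ((1 - s * s) * (4 * c[3] - c[4]) + (1 - s * m) * (2 * c[5] - 3 * c[7])
            + (1 - m * m) * (4 * c[6] - c[8]))


s, m = Fraction(2, 3), Fraction(-1, 4)
v, w = s - m, 1 - m * m
c = {i: Fraction(i * i - 7, i + 2) for i in range(1, 9)}   # arbitrary test values
print(rel_iii(c, s, m) - 4 * rel_ii(c, s, m) == -v * v * w * w * rel_572(c, s, m))
```

Since the identity holds for arbitrary coefficients, any set satisfying (ii) and (iii) necessarily satisfies (57.2).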
When the stop is central and m is allowed to tend to zero, one gets
the secondary counterparts to (56.7). After some simplification

    $3s_7 + 3s_8 + 2s_9 = -\tfrac34$,
    $2s_5 + 4s_6 - 3s_7 - s_8 = 0$,   (57.3)
    $s_2 + s_3 + s_4 + s_5 + s_6 = 0$.

When the characteristic comatic coefficients have been reduced to
zero, i.e. $s_2 = s_5 = s_7 = s_9 = 0$, one is then left with

    $s_3 + s_4 = \tfrac{1}{16}$,   $s_6 = -\tfrac{1}{16}$,   $s_8 = -\tfrac14$,   (57.4)

with factors $f^{-4}$ on the right when $f \neq 1$. Under these conditions
the secondary meridional pseudo-displacement is known exactly.


In f: t , ’ , ; 
ss e! = £f*(2p%h'?— sph"). (57-5) 


When amongst the characteristic secondary coefficients that of
distortion alone is non-zero it must have the value

    $s_9 = -\tfrac38 f^{-4}$.   (57.6)


The following point is of interest. Suppose that all the primary 
and secondary aberration coefficients other than those of distor- 
tion vanish. Then, according to (24.13) the effective coefficients 
ps, sf, s®, SF will not be zero for general values of s and m, bearing 
in mind that pg + 0, as we saw previously. The system can therefore 
not form a sharp image in.¥’, except possibly when m = 0, for then 
5* = s* = o. We therefore have the interesting questions whether 
(when m? + 1) the system can at least form a sharp image in some 
surface other than .%’, and whether this surface, if it exists, can ever 
be plane. The answers to these will be provided in Sections 62 
and 63. 


58. The relations of all orders when s = 1/m 


One can proceed to the seventh and higher orders after the fashion 
of Section 57, but it is hardly feasible to write out in full the relations 
of order 2n—1 for general values of s and m. However, it may be 
reflected that the effects of stop shifts on the coefficients of the angle 
characteristic are rather easily accounted for, as we saw in Section 
44. Consequently, we may choose some convenient value of s 
and find the general relations for this; and, if required, the stop 
may finally be shifted so as to correspond to the actual value of s 
which may happen to be of interest. This procedure would thus be 
a special case of that outlined at the end of Section 53. 

In view of the simplicity of (56.12) we choose s = 1/m, i.e. u = 0. 


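Then, defining a set of purely numerical coefficients $b^{(n)}_{\mu\nu}$ $(= b^{(n)}_{n-\nu,\,n-\mu})$
by the equation

    $(L - 2mM + m^2N)^n = \sum_{\mu,\nu} b^{(n)}_{\mu\nu}\, m^{2n-\lambda}\, N^{n-\mu}M^{\mu-\nu}L^{\nu}$,   (58.1)

the left-hand member of (54.3) becomes

    $\sum_{\mu,\nu} [\,t^{(n)}_{\mu\nu}\, w^{2n}m^{\lambda-2n} - K_n b^{(n)}_{\mu\nu}\, m^{2n-\lambda}\,]\, N^{n-\mu}M^{\mu-\nu}L^{\nu}$,   (58.2)

where $\lambda = \mu + \nu$; bearing in mind that now $v = w/m$. Excluding
again the case $m^2 = 1$, we immediately read off the whole set of
required relations,

    $m^{\lambda}\, t^{(n)}_{\mu\nu} - m^{2n-\lambda}\, t^{(n)}_{n-\nu,\,n-\mu} = K_n b^{(n)}_{\mu\nu}\,(m^{2n-\lambda} - m^{\lambda})\,(m/w)^{2n}$.   (58.3)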


Here $\lambda = 1, 2, \ldots, n-1$, the value $\lambda = 0$ corresponding to a relation
which involves the unwanted coefficient $t^{(n)}_{nn}$. Evidently, as $\lambda$ increases,
successive relations involve just two coefficients at a time,
both of which are alternately of comatic and astigmatic type;
specifically, circular coma and distortion when $\lambda = 1$, oblique
spherical aberration and linear astigmatism when $\lambda = 2$, cubic
coma and elliptical coma when $\lambda = 3$, and so on. Since the number
of relations of any order is not affected by the value of s, we can
obtain it directly from (58.3) by counting the number of allowed sets
of values of $\mu$ and $\nu$. $\lambda$, we know, ranges from 1 to $n-1$. For any
value of $\lambda$, $\nu$ can then range from 0 to $\tfrac12\lambda$ or $\tfrac12(\lambda-1)$, according as
$\lambda$ is even or odd. Thus, when n is even, there are $2 + 3 + \ldots + \tfrac12 n$
possibilities corresponding to even values of $\lambda$, and $1 + 2 + \ldots + \tfrac12 n$
corresponding to odd values. Proceeding similarly when n is odd
we find in this way that the number of relations of order $2n-1$ is


    $\tfrac14(n^2 + 2n - 4)$ when n is even,
    and $\tfrac14(n-1)(n+3)$ when n is odd.   (58.4)

Since the number of coefficients is $\tfrac12 n(n+3)$, the number of
independent coefficients of order $2n-1$ is

    $\tfrac14(n+2)^2$ when n is even,
    and $\tfrac14(n+1)(n+3)$ when n is odd.   (58.5)

Thus there are only 4, 6, 9, 12, ... independent coefficients of orders
3, 5, 7, 9, ....
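
The counting argument can be replayed in a few lines. The sketch assumes the reading given in the text: one relation for every pair $(\lambda, \nu)$ with $1 \leq \lambda \leq n-1$ and $0 \leq \nu \leq \lfloor\lambda/2\rfloor$, and $\tfrac12 n(n+3)$ coefficients of order $2n-1$ once the unwanted one is left out. The function names are hypothetical.

```python
def count_relations(n):
    """Number of reversibility relations of order 2n - 1 (m^2 != 1), by direct enumeration."""
    return sum(lam // 2 + 1 for lam in range(1, n))   # nu = 0 .. floor(lam / 2)


def closed_form(n):
    """The closed expressions (58.4)."""
    return (n * n + 2 * n - 4) // 4 if n % 2 == 0 else (n - 1) * (n + 3) // 4


def independent_coefficients(n):
    """Coefficients of order 2n - 1 left free after the relations are imposed."""
    return n * (n + 3) // 2 - count_relations(n)


for n in range(2, 8):
    print(2 * n - 1, count_relations(n), closed_form(n), independent_coefficients(n))
```

The enumeration agrees with (58.4) for every n tried, and the last column reproduces the sequence 4, 6, 9, 12, ... of (58.5).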


59. The relations of all orders when $m^2 = 1$


The time has come to investigate the consequences of reversibility
when $m^2 = 1$, for this case has hitherto been strictly excluded. $d^*$
now vanishes, and (54.3) reduces to

    $t^{(n)}(\bar\xi, \bar\eta, \bar\zeta) = t^{(n)}(\tilde\xi, \tilde\eta, \tilde\zeta)$.   (59.1)

Using (43.4) and (43.5) one gets the simple result

    $\tilde\xi = \bar\xi - 2j\bar\eta + j^2\bar\zeta$,   $\tilde\eta = -\bar\eta + j\bar\zeta$,   $\tilde\zeta = \bar\zeta$,   (59.2)

where

    $j = 1 \pm s$ when $m = \pm 1$.   (59.3)

A new important feature arises here on account of the fact that
$t^{(n)}_{nn}$ disappears from (59.1), so that the need to eliminate it does not
arise now. Accordingly one has in every order one additional relation,
as compared with the number of relations which obtains when
$m^2 \neq 1$, i.e. that given by (58.4).

The primary coefficients turn out to be subject to the relations 


2jPi + Pz = 0, 
J*Pa+ 2j(P3t Pa) + 2Ps = O. 


The first of these corresponds to (56.4), but the second is the 
additional primary relation just referred to. There are no inhomo- 
geneous relations now, so that there is no a priori reason against 
the achievement of perfect imagery. 

The most interesting situation is perhaps that of complete
reversibility, $s = 1$ and $m = -1$. Then (59.2) shows that $t^{(n)}(\xi, \eta, \zeta)$
must be invariant under the reversal of the sign of $\eta$. Therefore the
reversibility conditions are here simply


(59-4) 


$$t^{(n)}_{\mu\nu} = 0 \quad\text{when } \mu - \nu \text{ is odd.} \qquad (59.5)$$

However, when $\mu - \nu$ is odd so is $\mu + \nu$, and reference to Section 22
tells us that this means exactly the vanishing of all characteristic
coefficients governing aberrations of comatic type. In particular,


(i) in third order: $p_2 = p_5 = 0$, (59.6)

(ii) in fifth order: $s_2 = s_5 = s_7 = s_9 = 0$,


and so on. 
The following remarks concerning the limit $m \to -1$ (when
$s = 1$) may not be out of place. It is clear that, on account of the factors
$(1 - m^2)^2$ which are present on the left of (55.1), this limit is not
as tractable as the limit $m \to 0$. Suppose one simply sets $m = -1$
in (56.6) and (56.10), with $s = 1$ in the latter; then the first of
these gives p. = 0, but the second gives p. =o again, and not 
ps = 0 as one might possibly have expected. There is, however, 
no inconsistency, for, by simple subtraction, po, ps, py may be 
eliminated between the two equations in question. If a factor $(1 + m)^2$
be removed from the resulting equation, a factor which becomes
zero as $m \to -1$, one is left with


Pst 2(1 +m) py = Z(t +m)/[m(1 —m)”I, (59-7) 


and now, as $m \to -1$, one gets $p_5 = 0$.

A few words may also be added concerning the procedure outlined 
at the end of Section 53. It will be amply sufficient to do so in the 
context of the primary coefficients, since the generalization to all 
orders is obvious. In the equations of Section 46 we set $= 1, 


m = —1, so that the p, are the ‘known’ coefficients; and 

bz = ps = 0. (59.8) 
Then p=(tt+m)/(t+s), g=(t-s)/ =) oe 
| = i(1—m), c= 21+), 
and ji = —zaa(t +m). (59-10) 


(59.8) then imply two relations between the p,, and when zg is 
eliminated between them, one gets just (56.4); though the work 
involved in this process is quite tedious. If we take s = 1, however, 
the situation is somewhat simpler, for then q = o. In fact one finds 
at once that £, = ois exactly equivalent to equation (59.7). 


60. The relations when the object is at infinity 


Infinite object distance corresponds to m being zero. In Section 56 
we already dealt with this case by letting m tend to zero from finite 
values. We want to confirm that the results thus obtained are in fact 
correct, limiting processes being now excluded. For this purpose
we have to choose a new anterior base-plane, and we take this to be
é. According to equation (15.17) we then have 


T = 9(f)+s—y(1 —6) 4+ 2(€, 9, €). (60.1) 


The distance OFE may be found in the same way as d* was deter- 
mined at the end of Section 53, and it turns out to be 1/s. Hence 


T* = T+s-la. (60.2) 
If all non-linear terms which depend upon ¢€ alone be now ab- 
sorbed in ¢ as usual, 
T* = g.—7H(1 —) 4 +4(E, 9, 6), 
with E=s%—as7+0, n=—sgtt, C= €. 
Accordingly 


t(SPE + 257+ 6,7 +86, 6) +(—1)" i) m6" (60.3) 


must be invariant under the mutual interchange of & and &. Here 
the sign of 7 has been reversed as a matter of convenience. This is 
certainly permissible since the invariance of the expression (60.3) 
is indifferent to the sign of 7. 

(60.3) may now be compared directly with (55.5). To the latter 
we arbitrarily add a term x, L” on the formal grounds that, as long 
as we disregard the factors multiplying L” and N*, all conclusions 
relating to factors which do not involve ¢™ will be unaffected. 
Proceeding to the limit m = 0, one has, first, that u=1, v=s, 
w=1. Again 


limk,, [((L —-2mM + m?N)" — L”] 
m-—>0 


1 
= lim (—1)""1 (*) (1 —m?)m— (—2anmL"1M +...) 
m—0 


= —2(-1)" (*) nL?41M = —(—1)r4 (,*) LM. 


If one now formally identifies L with ¢, M with 7, and N with &, 
the limit of (55.5), modified by the addition of x, L”, is identical 
with (60.3). It follows that all the results previously derived from
(55.5) with $m \neq 0$ will give valid results in the limit $m \to 0$; excepting
of course any relations containing the coefficients 7”). From (60.3), 
with the new anterior base-point, the relation giving 2 is 

SAD = sng, (60.4) 

When $n = 2$ the same result follows from (56.10) if one simply
omits the right-hand member from it. This formal device again 
reflects the difference between the present angle characteristic and 
that contemplated in Section 54. 


61. Remark on the effective aberration coefficients 


From very elementary geometrical considerations one infers easily 
that when the system is completely reversible the ray from O which 
passes through C, i.e. the central point of the stop here, also passes 
through J’. For this reason one traditionally states that a completely 
reversible system produces no distortion at all. Let us therefore 
investigate this result in the light of the results obtained above. 

In the primary domain, since p, = 0, the effective coefficient of 
distortion p;* also vanishes. Next, going on to the last member of 
(24.13), we find in the first place, since sy = p; = 0, that 


5g = 2a[8(p3+ Ps) Pg + 24(P3+Pa)I- 


However, since we are no longer including the factor (s—m)—} 
which appears on the right-hand side of (24.1), we have to supply 
every primary coefficient in the equation just quoted with an 
additional factor s—m (= 2, here). Also, a = 4, so that we finally 


get i = 4(Da+Ps)(32P6+1) (61.1) 


for the effective fifth-order coefficient of distortion; and this in 
general fails to vanish. 

The contradiction is only apparent, for it is brought about by 
diverse definitions of distortion. The traditional definition relates 
to the displacement of principal rays, i.e. rays through the axial 
point of the stop. In the work above we have throughout taken dis- 
tortion to refer to the displacement of rays through the axial point 
of the paraxial exit pupil, and occasionally called these principal 
rays. The two alternative definitions are clearly in conflict with one 
another unless spherical aberration has been removed for the pupil 
planes. 

To illustrate these remarks let us suppose that, in fact, not only 
spherical aberration, but all aberrations associated with the pupil 
planes have been removed. The angle characteristic referred to 
$E$ and $E'$ as base-points must therefore be a function $\phi$ of $\xi - 2\eta + \zeta$
alone, since $s = 1$. Hence the angle characteristic referred to $O_0'$
and $O_0$ as base-points is


T = f(€)+2(1 —£)8 + 2(1 — 3, 
since d = d’ = 2; that is to say, 


T = o(€)+2[1-4F(—-29+ 0) +2[1-fE+27+O)]}2. (61.2) 
Under these conditions we therefore know the values of all aberra- 


tion coefficients except those of spherical aberration. In particular, 
if d = Go t+ $1E+ 267 +..., we have 


t) = (hb. — gy) 8? —ga(286 + 49 + 6”); (61.3) 
1.€. P2=9, pg = Fe: Pi = =; Ps =9, Pe= — ge (61.4) 


This value of p, correctly reduces s¢* to zero. 

There will, in general, be residual comatic displacements of the 
fifth and higher orders (cf. Section 25). In other words, complete 
reversibility of K does not guarantee the absence of coma of all 
kinds. Elementary demonstrations of the ‘absence of coma’ do 
not conflict with this conclusion, since they rest largely upon the 
consideration of certain meridional rays which intersect in the 
object space in points not in %. A precise interpretation of such 
results is therefore difficult to attain, and it is hardly worth pursuing 
this topic any further. 


62. On the attainability of a sharp image when $m^2 \neq 1$


The time has come to answer the question raised at the end of
Section 57. It was this: granted that $m^2 \neq 1$, can a reversible system
form a sharp image at all, and if so what can be said about the nature
of the image surface? It is advisable to treat the two cases $m \neq 0$
and $m = 0$ separately, and we proceed to deal with the first of these.
For this purpose it appears best to use the point characteristic,
associated, as usual, with the base-points $O_0$ and $E'$. The value of $s$
is clearly irrelevant here, and we choose it in such a way that $E'$ lies
as far to the right of $C$ as $O_0$ lies to the left of it, i.e. $s = 1/m$, according
to (56.11). $d' \neq 0$ since $m^2 \neq 1$, and we therefore set it equal to
unity as usual.

Take Cartesian axes with origin at $O_0'$, and let the coordinates of
points on the image surface $\mathcal{I}^{*\prime}$ be $X'$, $Y'$, $Z'$; so that $Y'$ does not,
in general, refer to $\mathcal{I}'$ now. Formation of a sharp image lying in $\mathcal{I}^{*\prime}$
of a plane object in $\mathcal{I}$ requires that all rays from any point $O(y)$
of $\mathcal{I}$ shall intersect in just the point $O'(X', Y')$ of $\mathcal{I}^{*\prime}$. If this is,
indeed, to be possible, there must therefore exist functions $C(\zeta)$
and $D(\zeta)$ such that

$$X' = C(\zeta), \quad Y' = yD(\zeta), \qquad (62.1)$$

$\xi$, $\eta$, $\zeta$ being the rotational invariants (13.1). It is obvious that one
must have

$$C(0) = 0, \quad D(0) = m. \qquad (62.2)$$


We are evidently looking for an ideal point characteristic which is,
for once, defined with respect to a curved image surface. We can
simply transcribe (7.4) into the present context. Because of symmetry,
$g(y, z)$ must be a function of $\zeta$ alone, say $g(\zeta)$, as usual. $d'$
becomes $1 + C(\zeta)$ and in place of $my$ we must take $yD(\zeta)$. Hence, at
once,

$$V = g(\zeta) - [P(\xi, \zeta) - 2\eta D(\zeta)]^{1/2}, \qquad (62.3)$$

where

$$P(\xi, \zeta) = 1 + \xi + 2C(\zeta) + C^2(\zeta) + \zeta D^2(\zeta). \qquad (62.4)$$


It remains to impose the condition of reversibility. Since $\mathcal{I}$ and $\mathcal{E}'$
are, by choice, symmetrically situated with respect to $\mathcal{C}$, (62.3) must
be invariant under the mutual interchange of $y$ and $y'$, or, what
comes to the same thing, of $\xi$ and $\zeta$. In other words the equation

$$g(\zeta) - [P(\xi, \zeta) - 2\eta D(\zeta)]^{1/2} = g(\xi) - [P(\zeta, \xi) - 2\eta D(\xi)]^{1/2} \qquad (62.5)$$

must be identically satisfied. Write both members of (62.5) as series
in ascending powers of $\eta$. Then the factors multiplying $\eta^n$ show that
one must have

$$P^{1/2-n}(\xi, \zeta)\,D^n(\zeta) = P^{1/2-n}(\zeta, \xi)\,D^n(\xi).$$

It suffices to take $n = 1$ and 2 in turn; and we conclude at once that

$$D(\zeta) = D(\xi) = \text{const.} = m, \qquad (62.6)$$

and

$$P(\xi, \zeta) = P(\zeta, \xi). \qquad (62.7)$$

It then follows from (62.5) that

$$g(\zeta) = g(\xi) = \text{const.} = g_0, \qquad (62.8)$$

say; whilst, in view of (62.4), the identity (62.7) reads

$$(1 - m^2)\zeta - 2C(\zeta) - C^2(\zeta) = (1 - m^2)\xi - 2C(\xi) - C^2(\xi).$$
This is possible only if both members of this equation are equal to
the same constant. The latter must be zero on account of (62.2),
so that

$$C(\zeta) = [1 + (1 - m^2)\zeta]^{1/2} - 1. \qquad (62.9)$$

The point characteristic $V$ is thus fully determined:

$$V = g_0 - (1 + \xi - 2m\eta + \zeta)^{1/2}. \qquad (62.10)$$

Finally, from (62.1), (62.6) and (62.9), there follows the equation
for $\mathcal{I}^{*\prime}$:

$$(X' + 1)^2 + (1 - 1/m^2)(Y'^2 + Z'^2) = 1. \qquad (62.11)$$

$\mathcal{I}^{*\prime}$ is therefore an ellipsoid or hyperboloid of revolution, according
as $m^2 > 1$ or $m^2 < 1$. Since $m^2 \neq 1$, it is never a plane. It should not
be forgotten that the unit of length was so chosen that

$$(1 - m^2)f/m = 1.$$
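The result may be illustrated numerically. In the following Python sketch the image point is taken from (62.1), (62.6) and (62.9), the exit pupil plane is assumed to lie at $X' = -1$ (the unit of length being $d' = 1$), and $V$ is formed as $g_0$ less the distance from the pupil point to the image point. The names of the functions are illustrative only.

```python
import math


def image_point(m, y, z):
    """Image point (X', Y', Z') belonging to the object point (y, z)."""
    zeta = y * y + z * z
    x_img = math.sqrt(1.0 + (1.0 - m * m) * zeta) - 1.0
    return x_img, m * y, m * z


def point_characteristic(m, y, z, y_p, z_p, g0=0.0):
    """V = g0 minus the distance from the pupil point (-1, y', z') to the image point."""
    x_img, y_img, z_img = image_point(m, y, z)
    return g0 - math.sqrt((1.0 + x_img) ** 2 + (y_img - y_p) ** 2 + (z_img - z_p) ** 2)


def surface_residual(m, y, z):
    """Left-hand member of (62.11) minus 1; it should vanish."""
    x_img, y_img, z_img = image_point(m, y, z)
    return (x_img + 1.0) ** 2 + (1.0 - 1.0 / (m * m)) * (y_img ** 2 + z_img ** 2) - 1.0
```

Interchanging $(y, z)$ with $(y', z')$ leaves the value of `point_characteristic` unchanged to rounding error, as reversibility requires, and `surface_residual` vanishes for both $m^2 > 1$ and $m^2 < 1$, provided the radicand in (62.9) remains positive.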

It remains to investigate the case $m = 0$ which has hitherto been
excluded. To this end we revert to the angle characteristic $T^*$ of
Section 53; and this coincides here with the focal angle characteristic
$T$ of Section 47. The equation of $\mathcal{I}^{*\prime}$ shall be

$$X' = C(\zeta), \quad Y' = \beta D(\zeta), \qquad (62.12)$$

where the functions $C$ and $D$ are of course not the same as those in
(62.1). Taking again $f = 1$, the counterparts to (62.2) are

$$C(0) = 0, \quad D(0) = 1, \qquad (62.13)$$

bearing (15.14) in mind. Now

$$Y' = y' + X'\beta'/\alpha',$$

so that one must have

$$\frac{\partial T^*}{\partial\beta'} = C(\zeta)\beta'/\alpha' - D(\zeta)\beta,$$

which on integration gives

$$T^* = -(1 - \xi)^{1/2}C(\zeta) - \eta D(\zeta) + g(\zeta), \qquad (62.14)$$

where $g$ is some function of $\zeta$ alone. Reversibility of $K$ now requires
that the identity

$$(1 - \xi)^{1/2}C(\zeta) + \eta D(\zeta) - g(\zeta) = (1 - \zeta)^{1/2}C(\xi) + \eta D(\xi) - g(\xi) \qquad (62.15)$$

be satisfied. In the first place one concludes at once that

$$D(\zeta) = \text{const.} = 1. \qquad (62.16)$$
Next, expand (62.15) in ascending powers of $\xi$, writing
$C(\xi) = -\tfrac12 k\xi + \dots$, and $g(\xi) = g_0 + g_1\xi + \dots$.

One needs to consider only the terms independent of, and linear in,
$\xi$, and these give rise to the two equations

$$C(\zeta) = k(1 - \zeta)^{1/2} + 2g_1, \quad g(\zeta) = C(\zeta) + g_0. \qquad (62.17)$$

Setting $\zeta = 0$ in the first of these, it follows that $g_1 = -\tfrac12 k$; so that

$$C(\zeta) = k[(1 - \zeta)^{1/2} - 1]. \qquad (62.18)$$

Thus, now,

$$T^* = g_0 - \eta - k[1 - (1 - \xi)^{1/2}][1 - (1 - \zeta)^{1/2}]. \qquad (62.19)$$

The equation of $\mathcal{I}^{*\prime}$ follows from (62.16) and (62.18). It is

$$(X'/k + 1)^2 + Y'^2 + Z'^2 = 1. \qquad (62.20)$$


The image surface is therefore now in general an ellipsoid of revolu- 
tion. Only in the exceptional case when k = o does this degenerate 
into a plane. However, one is of course still left with severe barrel 
distortion, for the displacement is then 


e’ = B(1—1/e) = y,[(1 + i+ 29) 4-1). (62.21) 
The actual image height $H'$ thus stands to the ideal image height $h'$
in the ratio $(1 + h'^2)^{-1/2}$; and as $h' \to \infty$, $H' \to 1$; and generally the
system transforms a line in the y, z-plane at infinity, whose ideal


image is at a perpendicular distance y, from the axis of K, into the 
arc of an ellipse of eccentricity (1 +2)-?. 
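The barrel distortion which remains when $k = 0$ is easily exhibited numerically. The sketch below assumes $f = 1$ and takes the ideal image point to be $(\beta/\alpha, \gamma/\alpha)$ and the actual one to be $(\beta, \gamma)$, signs being disregarded; the symbol `y1` stands for the perpendicular distance of the ideal image line from the axis, and all names are illustrative.

```python
import math


def actual_image_point(y_ideal, z_ideal):
    """Map an ideal image point to the actual one when k = 0 and f = 1."""
    r = math.sqrt(1.0 + y_ideal ** 2 + z_ideal ** 2)
    return y_ideal / r, z_ideal / r


def ellipse_residual(y1, z_ideal):
    """Residual of Z^2 + (1 + y1^2) Y^2 / y1^2 = 1 for the image of the line y = y1."""
    y_act, z_act = actual_image_point(y1, z_ideal)
    return z_act ** 2 + (1.0 + y1 ** 2) * y_act ** 2 / y1 ** 2 - 1.0


def eccentricity(y1):
    """Eccentricity of that ellipse, whose semi-axes are 1 and y1 / sqrt(1 + y1^2)."""
    semi_minor = y1 / math.sqrt(1.0 + y1 ** 2)
    return math.sqrt(1.0 - semi_minor ** 2)
```

The ratio of actual to ideal height returned by `actual_image_point` is $(1 + h'^2)^{-1/2}$, every point of the ideal line falls on the same ellipse, and `eccentricity` agrees with the value quoted above.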


63. Generalization to curved object surfaces 


The results of the preceding section surely suggest that one should
also consider curved object surfaces, and inquire in that case again
into the attainability of a sharp image. Take $m \neq 0$, as before, and
let the object surface $\mathcal{I}^*$ have the equation

$$x = K(\zeta), \qquad (63.1)$$

where $\zeta = y^2 + z^2$, and $y$ denotes coordinates in $\mathcal{I}^*$. The equations
of $\mathcal{I}^{*\prime}$ are (62.1) again. Also, we take $s = 1/m$ as before. However,
instead of $\mathcal{E}'$ as posterior base-plane, we take a curved base-surface
$\mathcal{B}^{*\prime}$ which is the 'reflection' of $\mathcal{I}^*$ in $\mathcal{C}$. This surface evidently
passes through $E'$. It has the advantage that the point characteristic
$V_1$ must be invariant under the mutual interchange of $\xi$ and $\zeta$,
$y'$ being now coordinates in $\mathcal{B}^{*\prime}$, of course; but $y$ = const. still
characterizes a given point of the object. In terms of the coordinate
system to which $X'$, $Y'$ refer, the equation of $\mathcal{B}^{*\prime}$ is

$$x' = -1 - K(\xi). \qquad (63.2)$$

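Hence by the usual argument

$$V_1 = g(\zeta) - \{[1 + K(\xi) + C(\zeta)]^2 + \xi - 2\eta D(\zeta) + \zeta D^2(\zeta)\}^{1/2}. \qquad (63.3)$$

Expanding this in ascending powers of $\eta$, invariance of $V_1$ under the
mutual interchange of $\xi$ and $\zeta$ requires that $g(\zeta)$, $D(\zeta)$ and $V_1(\xi, 0, \zeta)$
be separately invariant. Thus

$$g(\zeta) = \text{const.} = g_0, \quad D(\zeta) = \text{const.} = m, \qquad (63.4)$$

and

$$[1 + K(\xi) + C(\zeta)]^2 + \xi + m^2\zeta = [1 + K(\zeta) + C(\xi)]^2 + \zeta + m^2\xi. \qquad (63.5)$$

Now expand this in powers of $\xi$, writing $K(\xi) = K_1\xi + \dots$, and
$C(\xi) = C_1\xi + \dots$. The terms independent of, and linear in, $\xi$ give

$$[1 + C(\zeta)]^2 + (m^2 - 1)\zeta = [1 + K(\zeta)]^2, \qquad (63.6)$$
$$2K_1[1 + C(\zeta)] + 1 = 2C_1[1 + K(\zeta)] + m^2. \qquad (63.7)$$

On setting $\zeta = 0$ in (63.7) it emerges that $K$ and $C$ are constant
multiples of each other; say

$$K(\zeta) = kC(\zeta). \qquad (63.8)$$

Then (63.6) yields

$$C(\zeta) = \frac{1}{1+k}\left\{\left[1 - \frac{(1+k)(m^2-1)}{1-k}\,\zeta\right]^{1/2} - 1\right\}. \qquad (63.9)$$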


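The explicit form of $V_1$ is now at hand. To write it down compactly,
put temporarily

$$a = \frac{1 - k^2m^2}{1 - k^2}, \quad b = \frac{2k}{(1+k)^2}, \quad c = \frac{(1+k)(m^2-1)}{1-k}, \qquad (63.10)$$

and then

$$V_1 = g_0 - \{1 + a(\xi + \zeta) - 2m\eta + b[(1 - c\xi)^{1/2}(1 - c\zeta)^{1/2} - 1]\}^{1/2}. \qquad (63.11)$$

This is of course invariant under the mutual interchange of $\xi$ and
$\zeta$. We are now in a position to write down the equation of $\mathcal{I}^*$,

$$[1 + (1 + 1/k)x]^2 + \frac{(1+k)(m^2-1)}{1-k}(y^2 + z^2) = 1, \qquad (63.12)$$

and the equation of $\mathcal{I}^{*\prime}$ conjugate to it,

$$[1 + (1+k)X']^2 + \frac{(1+k)(1 - 1/m^2)}{1-k}(Y'^2 + Z'^2) = 1. \qquad (63.13)$$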
In general, therefore, either both surfaces are ellipsoids or they
are both hyperboloids. A given ellipsoid can evidently have a sharp
image only for one value of $m^2$. When $\mathcal{I}^*$ is spherical, then so is
$\mathcal{I}^{*\prime}$, and this situation obtains when $k^2m^2 = 1$. At this point we
restore, for the moment, an arbitrary unit of length, so that

$$d' = m^{-1}(1 - m^2)f.$$


Then the centres of these anterior and posterior spheres are at 


x= —(m"F1)f and X’ =—(1}m)f respectively, according as 
k = +m; and in either case their radii R, R’ obey the relation 
R’ = |m|R. (63.14) 


The sharp imaging of the two alternative object spheres can only be 
effected by distinct systems. 
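The conclusions of this section lend themselves to a simple numerical check. The sketch below works in the unit of length $d' = 1$ and assumes that the solution of (63.6) with $K = kC$ and $C(0) = 0$ is $C(\zeta) = \{[1 - c\zeta]^{1/2} - 1\}/(1+k)$, where $c = (1+k)(m^2-1)/(1-k)$; the function names are hypothetical.

```python
import math


def c_param(k, m):
    """The constant c = (1 + k)(m^2 - 1)/(1 - k)."""
    return (1.0 + k) * (m * m - 1.0) / (1.0 - k)


def C(zeta, k, m):
    """Image-surface function: solution of (63.6) with K = k*C and C(0) = 0."""
    return (math.sqrt(1.0 - c_param(k, m) * zeta) - 1.0) / (1.0 + k)


def K(zeta, k, m):
    """Object-surface function, a constant multiple of C."""
    return k * C(zeta, k, m)


def radicand(xi, eta, zeta, k, m):
    """Expression under the square root in (63.3), with D = m."""
    return (1.0 + K(xi, k, m) + C(zeta, k, m)) ** 2 + xi - 2.0 * m * eta + m * m * zeta
```

For arbitrary admissible $k$ and $m$ the function `radicand` is symmetric in its first and third arguments. When $km = 1$ the points $(K(\zeta), \zeta^{1/2})$ lie on a sphere of radius $R = |k/(1+k)|$ centred at $x = -k/(1+k)$, the image points on one of radius $R' = 1/|1+k| = |m|R$.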

The last remark requires a careful comment. To this end, consider
more generally the conjugate surfaces corresponding to $(k, m)$
having the values $(k_1, m_1)$, say. Reversibility of the system entails
that if one takes as new object surface the 'reflection' of the previous
image surface in $\mathcal{C}$, then the system will produce a sharp image of
this, namely, the 'reflection' of the original object surface in $\mathcal{C}$.
In other words, the conjugate surfaces corresponding to

$$(k, m) = (1/k_1, 1/m_1)$$


are conjugate with respect to the same system. (63.12) and (63.13) 
are indeed consistent with this elementary conclusion, since these 
equations in effect interchange place when k and m are replaced 
by 1/k and 1/m respectively. Again, contemplating the usual angle 
characteristic, the sharp imagery certainly requires that the coefficients
of primary spherical aberration and coma vanish, i.e.
$p_1 = p_2 = 0$ for given $(k, m)$. Then, however, $p_1 = p_2 = 0$ must
also hold for $(1/k, 1/m)$, and one confirms by means of (45.9), with
p =m+1, that this is the case in virtue of the relations (56.4) and 
(56.10). Now return to the two pairs of spheres considered above. 
To the first pair, the magnification being m, there corresponds 
another pair at magnification $1/m$. The centres of the object spheres
are, as we have seen, at $x = -(m^{-1} - 1)f$ and $x = -(m - 1)f$, respectively.
As regards the second pair, one has again to consider
two object spheres, namely at $x = -(m^{-1} + 1)f$ and at $x = -(m + 1)f$.
If now the given system were to form a sharp image of all four 
spheres, the coefficients of primary coma would have to vanish for 
four distinct values of the magnification. This, however, is impos- 
sible, since, according to (45.9), the equation p, = 0 is cubic in p, 
bearing in mind that 5 + 1. 


B. Concentric systems 


64. Definition of the concentric system 


A system $K$ will be called concentric if there exists a point $C$ such
that an arbitrary, sufficiently small, rotation of $K$ about $C$ results in
a system indistinguishable from $K$. It goes without saying that, for
the purposes of this definition, neither the pupil planes nor the
object and image planes are to be reckoned as 'parts of $K$'. We
choose any line through $C$ as the axis of $K$, and to this $\mathcal{I}$ and $\mathcal{I}'$
are normal. Obviously $K$ is symmetric in the usual sense with respect
to $\mathcal{A}$. $C$ will be called the centre of $K$, and the normal plane $\mathcal{C}$
through it the central plane. In practice one usually only utilizes,
indeed only constructs, that part of $K$ which, roughly speaking,
is contained within a circular cylinder of which $\mathcal{A}$ is the axis; but
this is a feature which need not concern us. One need only reflect
that if $\mathcal{R}$ is some ray, not too distant from $\mathcal{A}$, which passes through $K$,
then $\mathcal{R}$ 'does not know' whether it is in fact passing through $K$ or
through the rotated system $\bar K$, provided the rotation was not too
large. At any rate, the concentric system has the highest degree of
symmetry any system can possess; as a consequence of which the
characteristic function is determined to within one unknown function
of a single argument, as we shall see.


65. Generic form of the angle characteristic 


We shall generally proceed in terms of the angle characteristic $T$,
granted that $K$ is non-telescopic (see, however, Section 70).
Then the most general form of $T$ consistent with the central
symmetry of $K$ is to be determined. This problem will be solved
first by a relatively long and somewhat formal argument. Once the
result is at hand we shall recognize how it can be obtained in a
rather more elegant fashion.

Let the base-points $B$, $B'$ be arbitrarily selected for the time being,
and let them be situated at distances $q$ and $q'$ to the left and right of
$C$ respectively. The coordinate axes are disposed in the usual way.
Then let $\mathcal{A}$ be rotated through an angle $\psi$ about a line through $C$,
normal to the meridional plane; this rotation being equivalent to a
rotation of $K$ through the angle $-\psi$. As a consequence, $B$, $B'$ go
into $\bar B$, $\bar B'$, and the direction cosines $\beta'$, $\beta$ of a given ray $\mathcal{R}$ into
$\bar\beta'$, $\bar\beta$. If $T$ is the angle characteristic referred to $B$ and $B'$, and $\bar T$ that
referred to $\bar B$ and $\bar B'$,

$$\bar T(\bar\beta', \bar\beta) = T(\bar\beta', \bar\beta), \qquad (65.1)$$

on account of the central symmetry of $K$. In other words, the
functional forms of the old and new angle characteristics (referred
to the appropriate base-points) are the same, since there is nothing
to distinguish $K$ (with $B$, $B'$ as base-points) from $K$ (with $\bar B$, $\bar B'$
as base-points). Note that

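$$\bar\beta = \beta\cos\psi - \alpha\sin\psi, \quad \bar\gamma = \gamma, \qquad (65.2)$$

with analogous equations for the primed direction cosines. If
$Q$, $Q'$ are the feet of the normals drawn from $B$, $B'$ on to $\mathcal{R}$ (see
Fig. 2.3), one has corresponding points $\bar Q$, $\bar Q'$; and one then has
the relation

$$\bar T = T + Q'\bar Q' + \bar Q Q, \qquad (65.3)$$

regarded as being between optical distances. It is not difficult to
confirm that

$$\bar Q Q = q[\beta\sin\psi + \alpha(\cos\psi - 1)], \quad Q'\bar Q' = q'[\beta'\sin\psi + \alpha'(\cos\psi - 1)]. \qquad (65.4)$$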
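There follows the identity

$$T(\beta'\cos\psi - \alpha'\sin\psi, \gamma', \beta\cos\psi - \alpha\sin\psi, \gamma) = T(\beta', \gamma', \beta, \gamma) + q'[\beta'\sin\psi + \alpha'(\cos\psi - 1)] + q[\beta\sin\psi + \alpha(\cos\psi - 1)], \qquad (65.5)$$

valid for all values of $\psi$. This must hold, in particular, for infinitesimal
values of $\psi$:

$$T(\beta' - \psi\alpha', \gamma', \beta - \psi\alpha, \gamma) = T(\beta', \gamma', \beta, \gamma) + (q'\beta' + q\beta)\psi,$$

whence

$$-\alpha'\frac{\partial T}{\partial\beta'} - \alpha\frac{\partial T}{\partial\beta} = q'\beta' + q\beta. \qquad (65.6)$$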
It will have been noticed that, whereas we might have considered
arbitrary rotations about $C$, we confined ourselves to rotations
about a line through $C$. This is, in fact, sufficient, as long as we also
make use of the symmetry of $K$ (relative to $\mathcal{A}$). This we do now, by
taking $T$ to depend on $\beta'$, $\beta$ only through the usual rotational invariants
$\xi$, $\eta$, $\zeta$:

$$T = T(\xi, \eta, \zeta), \qquad (65.7)$$

and then (65.6) leads to the two equations

$$2\alpha'\frac{\partial T}{\partial\xi} + \alpha\frac{\partial T}{\partial\eta} = -q', \quad \alpha'\frac{\partial T}{\partial\eta} + 2\alpha\frac{\partial T}{\partial\zeta} = -q, \qquad (65.8)$$

where, of course,

$$\alpha' = (1 - \xi)^{1/2}, \quad \alpha = (1 - \zeta)^{1/2}. \qquad (65.9)$$

The equations (65.8) are most easily solved by introducing in place
of $\eta$ the new independent variable

$$\chi = 1 - (1 - \xi)^{1/2}(1 - \zeta)^{1/2} - \eta \quad (= 1 - (\alpha\alpha' + \beta\beta' + \gamma\gamma')). \qquad (65.10)$$

Then

$$T = G(\chi) + q'(1 - \xi)^{1/2} + q(1 - \zeta)^{1/2}, \qquad (65.11)$$

and here only a single function of one argument remains unknown;
so that only one new constant enters into the aberration coefficients
of order $2n - 1$, given those of the lower orders.

Now that we have obtained the required result, a short-cut to its
derivation is plain. Thus initially one takes both base-points to
coincide with $C$. Then the corresponding angle characteristic
'T($\beta'$, $\beta$) must be invariant under arbitrary rotations. The only
invariant function of $\alpha'$, $\beta'$, $\gamma'$, $\alpha$, $\beta$, $\gamma$ (which transform like the
Cartesian coordinates themselves) is $\alpha\alpha' + \beta\beta' + \gamma\gamma'$, bearing in
mind that $\alpha^2 + \beta^2 + \gamma^2 = \alpha'^2 + \beta'^2 + \gamma'^2 = 1$, always. Thus 'T is
merely a function of $\chi$, and going over to the base-points to which $T$
refers, (65.11) follows at once.

We note in passing that

$$\chi = \tfrac12(\xi - 2\eta + \zeta) + O(4). \qquad (65.12)$$

If we now take $B$, $B'$ to coincide with $O_0$, $O_0'$ as usual, we have

$$q' = 1 - m, \quad q = 1 - 1/m, \qquad (65.13)$$

since the central plane and both (paraxial) pupils coincide when
$s = 1$. Writing

$$G(\chi) = G_0 + G_1\chi + G_2\chi^2 + \dots, \qquad (65.14)$$

(65.11) is in harmony with (14.35) provided

$$G_1 = 1. \qquad (65.15)$$
We note in passing that when the base-points are equidistant 


from C the angle characteristic satisfies the condition of reversi- 
bility; but K need not be reversible. 


66. The third- and higher-order aberrations 


It is instructive to write down the values of the primary aberration
coefficients. Replacing $2G_2$ by the more convenient symbol $k$ we
find directly from (65.11), together with (65.10) and (65.14),
that

$$t^{(2)} = \tfrac18[k(\xi - 2\eta + \zeta)^2 + m\xi^2 - 2\xi\zeta + m^{-1}\zeta^2]. \qquad (66.1)$$


On the right we now introduce &, 4, ¢ according to (15.12), and a 
little manipulation then gives, with c = 4(s—m)-, 


Py = (1 —m)? [R(t —m)? + ml, 

Po = —4¢(1 —m) (1-5) [k(1 —m)? +m], 

Ps = 2e(1—m) [R(x —m) (1 -s)°+ (m—s%)], 
Pa = 4e(1—s)?[k(1—m)? +m], 

bs = —4e(1—3) [R(1—m) (1 -s)?-+(m—5?)], 
Pe = c[R(1 —5)*+ (m —s*)?]. 


From these a number of interesting conclusions may be drawn 
straight away. Thus (i) for a central stop (s = 1), coma, astigmatism 
and distortion vanish; (ii) when, with m + 1, the spherical aber- 
ration vanishes, so do coma and astigmatism; (iii) if spherical 
aberration is corrected for magnification m it is also corrected at 
magnification 1/m; (iv) the invariant %, = 2(s—m)?(2p,—p,) of 
Section 50 has the value —1 (i.e. —a3 = focal length of K). In 
the primary domain, where the aberrations associated with the pupil 
planes may beignored as far as the actual displacement is concerned, 
(i) is obvious enough, for every ray through C passes through K 
undeviated. (ii) implies that K will, to this order, produce a sharp 
image on a surface %*’ whose curvature is, in view of (19.15), 


4a'(s—m) ps = —1. (66.3) 
(Recall that (19.15) related to a unit of length such that d’ = 1; 


(66.2) 
and the factor $(s-m)^{-1}$ on the right of (19.24) is not included in
t) here.) (66.3) reflects the fact that in the absence of spherical 
aberration, K produces a sharp image of a spherical object surface 
#* of radius qg, the image surface %*’ being a sphere of radius —q’, 
the minus sign meaning that it is concave towards C. This conclu- 
sion is obvious, since the central symmetry of K now extends also 
to %* and .%*’, We shall shortly consider questions of sharp im- 
agery, as well as generalizations to all orders of the various results 
just arrived at. 

For the sake of simplicity we shall henceforth usually take the 
stop to be central (s = 1). This choice corresponds, in any event, to 
the situation most often encountered in practice. Other values of 
s may, if desired, then be accommodated by using the results of 
Section 44. From (65.11), 


(os) 1 aes _ 
r= > |Gxt+(‘)@e+ae], (66.4) 
where &, 7), € are yet to be replaced by &, 77, ¢, using (15.12), with 
s = 1. Write “ _— 

X=T+ x OE: 6), (66.5) 
where t = 4(E—279+€), and d™ is homogeneous of degree n in &, €. 
ohen 19) = Gyt? + 2G,76® + G® —35(q'E3 + g&3), 

1) = Gy! + 3G,7°¢ + G,[27d® + (6®)?] (66.6) 
+ $® —755(q'E + 964), 
and so on. 
We consider the special case m = —1. Then, in terms of , 7, ¢, 


T=H8, GP =F, $= HE+ aa (66.7) 


PO = pea(E? + 266+ 9° +0") 9, 
and so on. As a matter of fact it is possible to show that 
(n+ 1) 6 = Hon 1) (E+E) 4 
— fal —2) (B+ 2EC— 4k +) GO, (66.8) 


from which it follows incidentally that ¢™ has 7? as a factor for 
every 7, since ¢® and ¢® have this factor. Bearing in mind that 
m = -1 it follows that ¢™ is an even function of y, consistently with
the fact that K satisfies the condition of reversibility. In particular 
(59.5) will here also be valid. From (66.6) and (66.7) one has, in 
particular, 


WS = $G309 + 3Gaby? — aba (E> + 36°C + 4b" + 360" + 49°C + C9). 
(66.9) 


According to (66.2) the coefficient of primary spherical aberration
vanishes when $G_2 = \tfrac18$; we now see that the secondary coefficient
vanishes when $G_3 = \tfrac1{32}$. These results may be very easily generalized
to all orders. Under the present circumstances we need only
set £€ > 46,7 > 14, €> 4, since only the coefficient of €” is required, 


and then T(E,0,0) = G(4E) + 4(1 — 46). (66.10) 
Thus, using the notation of equation (45.12), $S(\xi)$ will be zero
provided

$$G(\chi) = \text{const.} - 2(4 - 2\chi)^{1/2}. \qquad (66.11)$$


67. The displacement 


We proceed to consider the displacement e’, reverting to the vari- 
ables £, 7, ¢ for the time being. Write 


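$$\dot G(\chi) = dG/d\chi, \quad \Gamma = (\alpha'\beta - \alpha\beta')\dot G. \qquad (67.1)$$

Then, from (65.11),

$$y' = (\Gamma + q'\beta')/\alpha', \quad y = (\Gamma - q\beta)/\alpha. \qquad (67.2)$$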


The coordinates $y_E$, $y'_E$ of the points of intersection of rays with the
pupil planes follow from this by setting $q = q' = 0$; e.g. in the exit
pupil

$$y'_E = \Gamma/\alpha'. \qquad (67.3)$$

From (67.2)

$$\epsilon' = [(\beta'/\alpha') - (\beta/\alpha)][(m\alpha' - \alpha)\dot G + q'], \qquad (67.4)$$

since $q' = -mq$. Taking (67.3) into account, one concludes that

$$\epsilon'_y/\epsilon'_z = y'_E/z'_E \qquad (67.5)$$


for all rays. It follows immediately that (when the stop is central) 
all barred effective aberration coefficients vanish (cf. equation (23.1)), 


ua) = 0; (67.6) 


so that the total sagittal comatic asymmetry K,, vanishes (cf. equation
(19.22)). Further, an ideal (i.e. a plane, undistorted) image is unattain- 
able, since otherwise one would have to have 


$$\dot G = (1 - m)/(\alpha - m\alpha') \qquad (67.7)$$

for all rays. This means that $\dot G$ would have to be independent of $\eta$,
which is impossible outside the paraxial region.
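The reduction of the displacement to the factorized form (67.4), and the proportionality (67.5), may be verified numerically. The sketch below assumes a central stop, $q' = 1 - m$ and $q = 1 - 1/m$ as in (65.13), and takes the displacement to be $y' - my$ component by component; the argument `g_dot` stands for the derivative $dG/d\chi$ of an arbitrary $G$, and all names are hypothetical.

```python
import math


def ray_quantities(bp, gp, b, g, g_dot):
    """Return (alpha', alpha, dG/dchi) for the ray (beta', gamma', beta, gamma)."""
    alpha_p = math.sqrt(1.0 - bp * bp - gp * gp)
    alpha = math.sqrt(1.0 - b * b - g * g)
    chi = 1.0 - alpha * alpha_p - b * bp - g * gp
    return alpha_p, alpha, g_dot(chi)


def displacement_direct(bp, gp, b, g, m, g_dot):
    """epsilon' = y' - m*y, component by component, from (67.1) and (67.2)."""
    q_p, q = 1.0 - m, 1.0 - 1.0 / m
    alpha_p, alpha, gd = ray_quantities(bp, gp, b, g, g_dot)
    result = []
    for c_p, c in ((bp, b), (gp, g)):
        big_gamma = (alpha_p * c - alpha * c_p) * gd
        y_image = (big_gamma + q_p * c_p) / alpha_p
        y_object = (big_gamma - q * c) / alpha
        result.append(y_image - m * y_object)
    return tuple(result)


def displacement_closed_form(bp, gp, b, g, m, g_dot):
    """The factorized form (67.4)."""
    alpha_p, alpha, gd = ray_quantities(bp, gp, b, g, g_dot)
    factor = (m * alpha_p - alpha) * gd + (1.0 - m)
    return ((bp / alpha_p - b / alpha) * factor, (gp / alpha_p - g / alpha) * factor)


def exit_pupil_point(bp, gp, b, g, g_dot):
    """(67.3): point of intersection with the exit pupil plane."""
    alpha_p, alpha, gd = ray_quantities(bp, gp, b, g, g_dot)
    return ((alpha_p * b - alpha * bp) * gd / alpha_p,
            (alpha_p * g - alpha * gp) * gd / alpha_p)
```

The two expressions for the displacement agree for skew rays as well as meridional ones, and the displacement is always parallel to the vector from the axis to the point in which the ray meets the exit pupil.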


68. Spherical aberration 

Let it be assumed for the time being that $m \neq 0$. Then axial rays are
distinguished by the property that $y = 0$. Using the second member
of (67.2), (67.4) then reduces to

$$\epsilon' = q(\beta - m\beta')/\alpha'. \qquad (68.1)$$

If spherical aberration of all orders be absent, one must have

$$\beta = m\beta', \qquad (68.2)$$

which means that the sine condition is automatically fulfilled. Thus
absence of spherical aberration entails that the total circular comatic
displacement be zero. This must, of course, be so, in view of the
result stated just after (67.6), bearing Section 31 in mind.

(66.11) may now be generalized to arbitrary values of $m$ ($\neq 1$).
When $S(\xi) = 0$, equation (67.7) must hold for axial rays. For these
one may set $\gamma = \gamma' = 0$, so that the following equation results:

$$\dot G(1 - \alpha\alpha' - \beta\beta') = (1 - m)/(\alpha - m\alpha'), \qquad (68.3)$$

where

$$\alpha = [1 - m^2(1 - \alpha'^2)]^{1/2}, \quad \beta = m(1 - \alpha'^2)^{1/2}, \quad \beta' = (1 - \alpha'^2)^{1/2}. \qquad (68.4)$$

If one writes $\chi$ for the argument of the function $\dot G$, the right-hand
member of (68.3) may be expressed as a function of $\chi$, and this leads
to the equation

$$\dot G(\chi) = \pm(1 - m)[(1 - m)^2 + 2m\chi]^{-1/2}. \qquad (68.5)$$

Here the upper and lower sign are to be chosen according as
$m < 1$ or $m > 1$; for

$$\alpha' = (1 - m - \chi)[(1 - m)^2 + 2m\chi]^{-1/2}, \qquad (68.6)$$

and the condition that this must be positive determines whether
one has to choose the positive or the negative square root here. In
(68.5) the choice of the positive square root is understood. Thus

$$G(\chi) = m^{-1}|1 - m|[(1 - m)^2 + 2m\chi]^{1/2}. \qquad (68.7)$$


When $m = -1$ this reduces correctly to (66.11). When $m = 0$
some of the preceding results require modification. However,
as $m \to 0$ (68.7) breaks down solely on account of the coordinate-independent
term $(1 - m)^2/m$. We simply subtract it out, for this
process corresponds merely to the choice of a finitely situated
anterior base-point. Then (68.7) reduces to

$$G(\chi) = \chi \qquad (68.8)$$

in the limit $m = 0$.
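The function just obtained can be compared with the earlier results by expanding it in powers of $\chi$. The sketch below assumes that (68.7) reads $G(\chi) = m^{-1}|1-m|[(1-m)^2 + 2m\chi]^{1/2}$, so that the coefficient of $\chi^n$ follows from the binomial series; it also tests (68.3) for axial rays obeying the sine condition $\beta = m\beta'$. Exact rational arithmetic is used for the series, and the function names are illustrative.

```python
from fractions import Fraction
import math


def binomial_half(n):
    """Generalized binomial coefficient C(1/2, n) as an exact fraction."""
    result = Fraction(1)
    for i in range(n):
        result *= (Fraction(1, 2) - i) / (i + 1)
    return result


def G_coefficient(m, n):
    """Coefficient G_n of chi^n (n >= 1) in the expansion of the assumed (68.7)."""
    m = Fraction(m)
    scale = (1 - m) ** 2 / m
    return scale * binomial_half(n) * (2 * m / (1 - m) ** 2) ** n


def sine_condition_residual(m, beta_p):
    """dG/dchi * (alpha - m*alpha') - (1 - m) for an axial ray with beta = m*beta'."""
    alpha_p = math.sqrt(1.0 - beta_p ** 2)
    alpha = math.sqrt(1.0 - (m * beta_p) ** 2)
    chi = 1.0 - alpha * alpha_p - m * beta_p ** 2
    g_dot = abs(1.0 - m) / math.sqrt((1.0 - m) ** 2 + 2.0 * m * chi)
    return g_dot * (alpha - m * alpha_p) - (1.0 - m)
```

For every $m$ tried one finds $G_1 = 1$, in harmony with (65.15), and $k(1-m)^2 + m = 0$ with $k = 2G_2$, which is the condition for the vanishing of the primary spherical aberration; when $m = -1$ the coefficients $G_2$ and $G_3$ take the values $\tfrac18$ and $\tfrac1{32}$.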


69. The displacement when spherical aberration is absent 


In the absence of spherical aberration the angle characteristic is
fully known, on account of (65.11) and (68.7). The actual displacement
$\epsilon'$ is thus determined. We consider it only in the special, but
important, case of the object being at infinity, so that $m = 0$; and
then one has a situation of great simplicity. To begin with, taking
the anterior base-point at $E$, so that $q = 0$, we have

$$T = \text{const.} + \alpha' - (\alpha\alpha' + \beta\beta' + \gamma\gamma'); \qquad (69.1)$$

then, from (67.4) and (68.8),

$$\epsilon' = [(\beta'/\alpha') - (\beta/\alpha)](1 - \alpha). \qquad (69.2)$$

Again, from (67.3),

$$y'_E = (\alpha'\beta - \alpha\beta')/\alpha', \qquad (69.3)$$

so that (69.2) becomes

$$\epsilon' = y'_E\left(1 - \frac{1}{\alpha}\right). \qquad (69.4)$$

Since $\beta/\alpha = h'$, $\gamma/\alpha = 0$, one thus has, on introducing the usual
polar coordinates in the exit pupil,

$$\epsilon' = \rho[1 - (1 + h'^2)^{1/2}] \times \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}. \qquad (69.5)$$


(69.5) represents one of those rare occasions on which the exact 
displacement can be exhibited in closed form, as a function of the 
variables which enter into the series (23.1) defining the effective 


aberration coefficients. At any rate, the effective coefficients of all
types of coma of all orders are zero. 

It will not come amiss to contemplate for the moment the consequences
which would arise if instead of defining effective coefficients
relative to $y'_E$ and $h'$ we were to define them relative to $y_E$
and $h'$; so that in place of $\rho$ and $\theta$ we have polar coordinates, $\rho_1$,
$\theta_1$, say, in the entrance pupil. Then one has

$$\epsilon' = y_E(\alpha - 1)/\alpha', \qquad (69.6)$$

in place of (69.4). Evidently we need to obtain an expression for
$\alpha/\alpha'$ in terms of $y_E$ and $h'$, and this can be done starting from

$$\alpha'\beta - \alpha\beta' = \alpha y_E. \qquad (69.7)$$
Eventually there comes 
e’ = p,[t—(1+h2)-4] (1 — 2)" {p,h’ cos 0, 
—[(x —p3) (1 +h’) + p3h’? cos? O,]3} x a 6, 


sin 0,. (69.8) 


The new effective comatic coefficients are now not zero. Indeed,
the comatic displacement is

$$(\boldsymbol{\epsilon}')_{\text{com}} = \tfrac{1}{2}\rho_1^2 h'\,[1 - (1 + h'^2)^{-1/2}] \begin{pmatrix} 1 + \cos 2\theta_1 \\ \sin 2\theta_1 \end{pmatrix}. \qquad (69.9)$$

The dependence upon $\theta_1$ is the same for all orders, and the image
patch is circular. (The expression on the right of (69.9) is of course
O(5).) Going over from $\rho$ to $\rho_1$ ($= \alpha'\rho/\alpha$), the set of rays selected by
the condition $\rho_1$ = constant is different from the set which has $\rho$
constant. In short, one must guard against attaching basic significance
to results of a kind which depend fundamentally upon how
one happens to split up the family of rays from a given point of the
object into sets corresponding to this or that variable having a constant
value. In the end it is only the sum total of these sets that
counts, and it is relevant to recall that even in the absence of
vignetting the family of rays whose passage through K is assured
is characterized neither by $\rho = \rho_{\max}$, nor by $\rho_1 = (\rho_1)_{\max}$, but rather
by $\rho_s = (\rho_s)_{\max}$, where $\rho_s$ is a radial coordinate in the plane of the
stop.
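
The passage from (69.6) and (69.7) to the closed form (69.8) can be checked numerically. The sketch below is only an illustration under stated assumptions: it reads (69.7) as $\alpha'\boldsymbol{\beta} - \alpha\boldsymbol{\beta}' = \alpha\mathbf{y}_1$ with $\boldsymbol{\beta}/\alpha = (h', 0)$, and the function names are hypothetical, not taken from the text.

```python
import math

def image_direction_cosine(rho1, theta1, h):
    """alpha' solving (69.7) together with alpha'^2 + |beta'|^2 = 1 (assumed reading)."""
    c = math.cos(theta1)
    root = math.sqrt((1 - rho1**2) * (1 + h**2) + (rho1 * h * c) ** 2)
    return (1 - rho1**2) / (root - rho1 * h * c)

def normalization_residual(rho1, theta1, h):
    """alpha'^2 + |beta'|^2 - 1, with beta' = alpha' h (1, 0) - y_1 from (69.7)."""
    ap = image_direction_cosine(rho1, theta1, h)
    by = ap * h - rho1 * math.cos(theta1)
    bz = -rho1 * math.sin(theta1)
    return ap**2 + by**2 + bz**2 - 1

def displacement_69_6(rho1, theta1, h):
    """epsilon' = y_1 (alpha - 1) / alpha', with alpha = (1 + h'^2)^(-1/2)."""
    alpha = (1 + h**2) ** -0.5
    f = (alpha - 1) / image_direction_cosine(rho1, theta1, h)
    return rho1 * math.cos(theta1) * f, rho1 * math.sin(theta1) * f

def displacement_69_8(rho1, theta1, h):
    """Closed form (69.8) as reconstructed in this section."""
    c, s = math.cos(theta1), math.sin(theta1)
    root = math.sqrt((1 - rho1**2) * (1 + h**2) + (rho1 * h * c) ** 2)
    radial = rho1 * (1 - (1 + h**2) ** -0.5) * (rho1 * h * c - root) / (1 - rho1**2)
    return radial * c, radial * s
```

Agreement of the two displacement functions at a few sample rays supports the reading of (69.8) adopted here; it does not, of course, replace the algebra.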
70. On the existence of curved conjugate surfaces


It was already noted in Section 67 that an ideal image is unattainable.
However, we may inquire whether there may not exist pairs
of conjugate surfaces. To have such a pair at magnification m,
spherical aberration must certainly be absent for this magnification;
so that $G(\chi)$ must have the form given by (68.7). When this is
the case, it follows from the central symmetry of K that the spherical
surface $\mathscr{I}^*$ of radius $|q|$ will have as conjugate image surface
$\mathscr{I}^{*\prime}$ the spherical surface of radius $|q'|$, both having C as centre.

One might at first sight be inclined to conclude that this is the 
only possibility. This is, however, not so, as reference to the con- 
dition of reversibility shows at once. We already saw in Section 63 
that if a reversible system has a pair of conjugate surfaces at
magnification m then it has another such pair at magnification 1/m.
This conclusion evidently rests only upon the requirement that the 
condition of reversibility be satisfied—whether the system is in 
fact reversible or not is irrelevant. However, every concentric system 
satisfies the condition of reversibility ; from which we infer that there 
is another pair of conjugate spheres, as described above, but with 
q and q’ calculated for magnification 1/m. 

If this conclusion is not to be in conflict with the work of Section
68, $G(\chi)$ must remain unaffected when m is replaced by 1/m,
since otherwise spherical aberration would be present at magnification
1/m, and one could not attain a sharp image. It is easily
confirmed that $G(\chi)$ has, indeed, the required property. Further,
reference to the discussion following upon equation (63.14) shows
that all possibilities as regards pairs of conjugate surfaces are now
exhausted, bearing in mind that $m \to 1/m$ is the only substitution
which leaves G invariant.

In the course of the preceding investigation it was understood 
throughout that K satisfied the condition of not being telescopic. 
In the contrary case the situation is entirely different, for there 
then exists the possibility that every point of the object space has a 
sharp image. Let it be supposed, then, that K is telescopic. 

We adopt $W_1(\beta', \gamma', x, y, z)$ as the most appropriate characteristic
function. It will be noted that its dependence on x is now being
explicitly contemplated. We further refer all points to a coordinate
basis both origins of which coincide with the centre C; and it is
convenient to take this point to be also the location of the posterior
base-point B′. Because of our general assumption, every point O
of the normal plane $\mathscr{I}$ through $O_0(x, 0, 0)$ must have a sharp image,
the coordinates of which shall be $X'$, $\mathbf{Y}'$. Thus there must exist
functions $K(\zeta, x)$, $D(\zeta, x)$, $x'(x)$ such that

$$X' = x' + K, \qquad \mathbf{Y}' = \mathbf{y}D, \qquad (70.1)$$

$x'$ being the distance between B′ and the point conjugate to $O_0$.
(In the notation of Section 65, $x = -q$, $x' = q'$.) We therefore must
have agit KS = yD, 
which splits up into the two equations 
20'Wy = x'+K, W,, =—D. (70.2) 


However, since the system is telescopic, 8’ must vanish when ® 
does so, which implies that W, must be independent of ¢. (70.2) 
then shows that the same is true of D and K. Since K(o, x) = 0 we 
see that K is zero, whilst D can depend at most upon x: 


$$D(\zeta, x) = m(x), \qquad (70.3)$$
say. It further follows from (70.2) that 
W, = -—a'x'—my +k, (70.4) 


where k can depend at most upon x. Pursuing now an argument
which runs parallel to that following upon equation (65.11),
central symmetry entails that $W_1$ can only be a function of

$$\alpha'x + \beta'y + \gamma'z \quad\text{and}\quad x^2 + y^2 + z^2,$$

these being invariant under arbitrary rotations about C. Considering
(70.4) in the light of this conclusion we infer that

$$x' = mx. \qquad (70.5)$$

At the same time consistency of this result with the linearity of the
relation (14.36) requires that m be a constant. The last step in the
argument uses the differential equation

$$\left(\frac{\partial W_1}{\partial x}\right)^2 + \left(\frac{\partial W_1}{\partial y}\right)^2 + \left(\frac{\partial W_1}{\partial z}\right)^2 = 1. \qquad (70.6)$$

Inserting (70.4) (with (70.5)) in this, there comes

k=const., m=1. (70.7) 

Thusfinally, W,= +(a’x+ f’y+y’'s)+const. (70.8) 


Since x can be chosen at will, we conclude that every point of the 
object space has a well-defined image point. Moreover, bearing in 
mind that |x| = |x’|, it is possible for any part of the object space to 
have a sharply defined image geometrically similar to it. However, 
such imagery is somewhat trivial in as far as the (reduced) magnifica- 
tion has of necessity the value unity. The universal validity of the 
last statement may be established under quite general circum- 
stances. 


71. On the absolutely invariant aberrations 


The angle characteristic of a concentric system being known to 
within a single function of one argument, it is of interest to examine 
the absolute invariants $\alpha_{4N-1}$ considered in general circumstances in
Sections 49 and 50. However, since it is convenient to retain the 
variables £,77, €, we consider, in place of (49.6), the equivalent rela- 


- yy. = ANGEMy). (71.1) 


G(x) is the (2n — 1)th-order part of G(y); and the remaining terms 
of T are in any event annihilated by A. Evidently we may replace 
Gx) by G(x) in (71.1), provided all derivatives are evaluated at 
— =7j = € =o. Upon recalling (65.10) we see that 


x= E (i) (MGB Da. (gx.2) 
Write £m for the coefficient of (€¢)? in the expansion of the last 
factor on the right, so that incidentally 


2p >m (71.3) 


for all Bym which do not vanish identically. By straightforward use 
of the binomial expansion one finds that 


ma SGP os 


which is somewhat clumsy, in as far as the validity of (71.3) is not 
obvious by inspection. At any rate, from (71.4) or otherwise, it is not
difficult to construct Table 6.1 for the values of $\beta_{pm}$, for selected
small values of p and m.


TABLE 6.1 
m 

cor ——---'--r'''V[''[O Oooo 
po 1 2 3 4 5 6 
° I ° ° ° ° ° ° 
I ° —} 4 ° ° ° ° 
2 ° —i —ve —v6 3 ° ° 
1 _1 1 1 _5 5 

3 ° — 256 —128 —6é — 32 32. 7Teé 


(71.2) now becomes 


the dots indicating all terms which are eventually annihilated by 


AN, (§ =7] = § = 0). The remaining steps are quite similar to those 
of Section 50, and one finally arrives at the result 


aS aay n\(N—r)! 
= N! — 7)" o2N-2r 
A4n-1 + oa Pa T) 2 r! (n—2r)! PN—rn—2r G,,. 
(71.6) 


Table 6.1 is sufficiently extensive to evaluate $\alpha_3$, $\alpha_7$ and $\alpha_{11}$ from the
last equation, and it turns out that

$$\alpha_3 = -1, \qquad \alpha_7 = -(1 + 2G_2), \qquad \alpha_{11} = -9(1 + 2G_2 + 2G_3). \qquad (71.7)$$

Remarkably, the secondary and tertiary coefficients $G_3$, $G_4$ are
absent from $\alpha_7$, and the coefficients $G_4$, $G_5$, $G_6$, which relate to
orders 7, 9 and 11, do not enter into the eleventh-order invariant $\alpha_{11}$.
Nevertheless, the fact remains that the $G_n$ do enter into the $\alpha_{4N-1}$.
If desired, they may be expressed in terms of the coefficients of
spherical aberration; e.g. when s = 1 and m = 0, one has just


Gz = 3k = 4p. = Oy, (71.8) 
in view of (66.2) and (19.6). The effect of the presence of primary
spherical aberration on the value of $\alpha_7$ need therefore be by no means
negligible. However, when spherical aberration of all orders is
absent, $\alpha_{4N-1}$ reduces to

$$\alpha_{4N-1} = -[(2N-3)(2N-5)\cdots 5\cdot 3\cdot 1]^2. \qquad (71.9)$$
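
As a small consistency check, (71.9) may be evaluated for the first few N and compared with (71.7) when all the $G_n$ vanish. The sketch assumes the readings $\alpha_3 = -1$, $\alpha_7 = -(1 + 2G_2)$, $\alpha_{11} = -9(1 + 2G_2 + 2G_3)$ of (71.7); the function names are illustrative only.

```python
def invariant_without_spherical(N):
    """alpha_{4N-1} from (71.9): minus the square of (2N-3)(2N-5)...3.1.

    The product is empty (equal to 1) for N = 1.
    """
    product = 1
    for factor in range(2 * N - 3, 0, -2):
        product *= factor
    return -product**2

def invariants_71_7(G2, G3):
    """alpha_3, alpha_7, alpha_11 according to the assumed reading of (71.7)."""
    return {3: -1, 7: -(1 + 2 * G2), 11: -9 * (1 + 2 * G2 + 2 * G3)}
```

With $G_2 = G_3 = 0$ the two functions agree for N = 1, 2, 3, as they should if (71.7) and (71.9) are mutually consistent.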
Problems


P.6(i). Can a reversible system form a sharp image of a paraboloid?
If so, what is the nature of the image? 


P.6(ii). A reversible system transforms an hyperboloid into an 
hyperboloid. Their meridional sections have asymptotes the
angles between which are 2 and 27’ respectively. Show that 


tan = sin '(m? — sin? ’)-4. 
P.6(iii). The stop of a reversible system is so situated that s = 1/m.


Consider the condition of reversibility from the point of view of the 
point characteristic. In particular, recover equation (56.12) exactly. 


P.6(iv). The form of Y, given by equation (63.11) certainly must 
entail the absence of spherical aberration. This function does not,
however, have the familiar form $-(1 + \xi)^{1/2}$ when $\eta = \zeta = 0$, apart
from an irrelevant additive constant. Resolve the apparent
contradiction.


P.6(v). One often requires the homogeneous polynomials ¢,,(£, £) 
defined by 


(1-1-0 = EH 0. 
Show that they satisfy the recurrence relation 


(2+) Pnsa = (4-4) (E+ 6) bn —(n—2) Elbn 1 (n> 1). 


CHAPTER 7 


THE SEMI-SYMMETRIC SYSTEM 


72. Definition of the semi-symmetric system. Generic 
form of the characteristic functions 


Having considered the symmetric system at great length, ending 
up with the highest possible degree of symmetry, as represented by 
the concentric system, we move now in the other direction, as it 
were, and drop the second condition which formed part of the 
definition of the general symmetric system in Section 12. Accordingly,
a system is called semi-symmetric (more precisely, r-semi-symmetric)
if it has an axis of symmetry, but no plane of symmetry
containing the axis. In other words K will now have a built-in
screw-sense; see also the end of this section.

In the optical field a physical realization of semi-symmetry is 
likely to be rather artificial. We may, however, with advantage 
include in the term ‘optical system’ any image-forming device, no 
matter whether the image is formed by means of light or of particles. 
In that case an excellent example of a semi-symmetric system is
furnished by the electron microscope, provided focusing is achieved, 
at least in part, by magnetic fields. 

Choosing a coordinate basis for the time being exactly as in the
symmetric case, and assuming regularity of K with respect to it,
we know already from the discussion of Section 13 that the point
characteristic V must now depend explicitly upon the skew-symmetric
rotational invariant $\tau = z'y - zy'$, that is to say, V is to
be written as a power series in the four variables $\xi, \eta, \zeta, \tau$, rather
than merely $\xi, \eta, \zeta$. On the other hand, wherever $\tau^2$ occurs it can be
removed by replacing it by $\xi\zeta - \eta^2$. Thus the point characteristic of
a regular semi-symmetric system has the generic form

$$V(\xi, \eta, \zeta, \tau) = V^*(\xi, \eta, \zeta) + \tau\hat{V}^*(\xi, \eta, \zeta), \qquad (72.1)$$

where $V^*$ and $\hat{V}^*$ are power series in $\xi, \eta, \zeta$.
An analogous situation prevails with respect to the other characteristic
functions. In the case of the angle characteristic, for instance,

$$T(\xi, \eta, \zeta, \tau) = T^*(\xi, \eta, \zeta) + \tau\hat{T}^*(\xi, \eta, \zeta), \qquad (72.2)$$

where $\xi, \eta, \zeta$ are the rotational invariants (15.10), whilst the
skew-symmetric invariant $\tau$ is defined by

$$\tau = (s\gamma' - \gamma)(m\beta' - \beta) - (m\gamma' - \gamma)(s\beta' - \beta) = (s - m)(\beta'\gamma - \beta\gamma'). \qquad (72.3)$$

s and m have their usual significance, for we shall see that the
paraxial behaviour of the semi-symmetric system is entirely
equivalent to that of the symmetric system. When $\bar\xi, \bar\eta, \bar\zeta$ occur
as arguments of T, one replaces $\tau$ at the same time by $\bar\tau$, where

$$\bar\tau = \beta'\gamma - \beta\gamma', \qquad (72.4)$$

and then (72.2) can be taken over by merely supplying all the
variables with bars. However, note that, since

$$\tau = (s - m)\bar\tau, \qquad (72.5)$$

the functions $\hat{T}^*(\xi, \eta, \zeta)$ and $\hat{T}^*(\bar\xi, \bar\eta, \bar\zeta)$, for which one should really
use distinct functional symbols, differ by a factor $s - m$.
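
The relation between the two skew invariants is a purely algebraic identity and may be confirmed by direct evaluation. The following sketch assumes that (72.3) reads $\tau = (s\gamma' - \gamma)(m\beta' - \beta) - (m\gamma' - \gamma)(s\beta' - \beta)$ and that (72.4) reads $\bar\tau = \beta'\gamma - \beta\gamma'$; the helper names are hypothetical.

```python
import random

def tau(s, m, beta, gamma, beta_p, gamma_p):
    """Skew invariant in the form assumed for (72.3)."""
    return ((s * gamma_p - gamma) * (m * beta_p - beta)
            - (m * gamma_p - gamma) * (s * beta_p - beta))

def tau_bar(beta, gamma, beta_p, gamma_p):
    """Barred skew invariant in the form assumed for (72.4)."""
    return beta_p * gamma - beta * gamma_p

def max_mismatch(trials=1000, seed=0):
    """Largest |tau - (s - m) tau_bar| over random arguments."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        s, m, b, g, bp, gp = (rng.uniform(-2.0, 2.0) for _ in range(6))
        worst = max(worst, abs(tau(s, m, b, g, bp, gp) - (s - m) * tau_bar(b, g, bp, gp)))
    return worst
```

The mismatch stays at rounding level, in keeping with the statement that the two skew functions differ by the single factor $s - m$.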

It must be remarked that the ‘optical medium’ of any semi- 
symmetric system must be anisotropic; that is to say, somewhere 
within K the refractive index at any point must be dependent upon 
the direction of the ray—passing through this point—which is being 
considered. To see this, one need only reflect that, were the 
medium isotropic everywhere, a ray initially meridional would 
certainly have to be meridional throughout K; a conclusion which 
follows from the elementary laws of refraction in isotropic systems. 
The fact that the anisotropy here reveals itself only in the explicit 
dependence of the characteristic function upon the skew-symmetric 
rotational invariant $\tau$ is due to our continued adherence to the
agreement that in the object space and in the image space— 
especially in the latter—the refractive index is constant in the kind 
of system under consideration. This entails that the formal complica- 
tions arising from the need to distinguish between the direction of 
the wave-normal on the one hand, and the ray (i.e. the direction 
of energy transport) on the other, are absent in the region in 
which the image is being considered. Further details appear in 
Chapter 11, particularly in Section 114. 
73. The paraxial region. Adapted coordinate basis


Let the point characteristic V be referred to base-points $O_0$ and
B′, where B′ may, in the first instance, be chosen arbitrarily, so
long as it is not conjugate to $O_0$. Paraxially we write, by analogy
with the symmetric case,

$$V = a_0 + \tfrac{1}{2}k_1\xi + k_2\eta + \tfrac{1}{2}k_3\zeta + k_4\tau, \qquad (73.1)$$

so that, in principle, we now have four paraxial constants, instead
of three as previously. We shall, however, see in a moment that
$k_4$ is effectively redundant.

To study the consequences of (73.1) as it stands, it is advisable
somewhat to extend the notation first introduced just after equations
(14.4) in the following way. Given any two-component
quantity, say $\mathbf{U}$ ($= (U_1, U_2)$), one adjoins to it the quantity

$$\mathbf{U}^* = (-U_2, U_1).$$

If $\mathbf{U}$ be regarded as a Euclidean two-vector, $\mathbf{U}^*$ has the same length
as $\mathbf{U}$, but is orthogonal to it. Then, from (6.1),

$$\boldsymbol{\beta}' = k_1\mathbf{y}' + k_2\mathbf{y} + k_4\mathbf{y}^*, \qquad \boldsymbol{\beta} = -k_2\mathbf{y}' - k_3\mathbf{y} + k_4\mathbf{y}'^*. \qquad (73.2)$$

The coordinates of the point of intersection of a ray $\mathscr{R}$ with the
normal plane $x' = d'$ are therefore

$$\mathbf{Y}' = (1 + d'k_1)\mathbf{y}' + d'(k_2\mathbf{y} + k_4\mathbf{y}^*), \qquad (73.3)$$

cf. equation (14.8). All rays through O therefore intersect the plane
which has $d' = -1/k_1$ in just one point O′ the coordinates of which
are

$$Y' = \bar{d}'(y\cos\psi - z\sin\psi), \qquad Z' = \bar{d}'(y\sin\psi + z\cos\psi), \qquad (73.4)$$

with $\bar{d}' = d'(k_2^2 + k_4^2)^{1/2}$, $\tan\psi = k_4/k_2$.

This means that in the paraxial domain K produces a sharp and
undistorted image in the plane $\mathscr{I}'$ given by $d' = -1/k_1$ of an object
lying in the normal plane $\mathscr{I}$ through $O_0$. However, this image is now
turned as a whole through an angle $\psi$ about the axis of K relative
to the object, in the sense of equations (73.4). This conclusion informs
us that we have proceeded in a clumsy way, and we go on to
rectify this state of affairs.
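
The content of (73.3) and (73.4), namely that in the plane $d' = -1/k_1$ every ray from an object point lands on one and the same rotated and magnified point, may be illustrated as follows. This is a sketch only: it assumes the adjoint $\mathbf{U}^* = (-U_2, U_1)$ and the form of (73.3) given above, and all names are hypothetical.

```python
import math

def star(u):
    """Adjoint two-vector U* = (-U_2, U_1): same length as U, orthogonal to it."""
    return (-u[1], u[0])

def intersection_73_3(k1, k2, k4, y, y_prime):
    """Y' in the plane d' = -1/k1, according to the assumed form of (73.3)."""
    d = -1.0 / k1
    ys = star(y)
    return tuple((1 + d * k1) * y_prime[i] + d * (k2 * y[i] + k4 * ys[i]) for i in range(2))

def image_73_4(k1, k2, k4, y):
    """Rotated, magnified image point of (73.4), with tan(psi) = k4/k2."""
    d = -1.0 / k1
    psi = math.atan2(k4, k2)
    scale = d * math.hypot(k2, k4)
    return (scale * (y[0] * math.cos(psi) - y[1] * math.sin(psi)),
            scale * (y[0] * math.sin(psi) + y[1] * math.cos(psi)))
```

Varying the pupil coordinates `y_prime` leaves the intersection point unchanged, which is the paraxial sharpness of the image; the angle `psi` is the rotation that the adapted basis is designed to absorb.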
To this end, having started with the usual coordinate basis as
above, rotate the axes in the image space through an angle $\psi$ according to

$$\mathbf{y}' \to \mathbf{y}'\cos\psi + \mathbf{y}'^*\sin\psi. \qquad (73.5)$$

As a result of this

$$\xi \to \xi, \qquad \eta \to \eta\cos\psi - \tau\sin\psi, \qquad \zeta \to \zeta, \qquad \tau \to \eta\sin\psi + \tau\cos\psi,$$

whence

$$V \to a_0 + \tfrac{1}{2}k_1\xi + (k_2\cos\psi + k_4\sin\psi)\eta + \tfrac{1}{2}k_3\zeta + (k_4\cos\psi - k_2\sin\psi)\tau.$$

If we now choose $\psi$ so that $\tan\psi = k_4/k_2$, this reduces to

$$V = a_0 + \tfrac{1}{2}k_1\xi + k_2\eta + \tfrac{1}{2}k_3\zeta, \qquad (73.6)$$

where we have written $k_2$ in place of $(k_2^2 + k_4^2)^{1/2}$, with the understanding
that in (73.6) $k_2$ is the coefficient of $\eta$ when $\psi$ has just that value
which causes $\tau$ to disappear from the paraxial point characteristic;
consistently with the fact that $\xi, \eta, \zeta$ in (73.6) of course also refer
to this situation.

In short, we can always choose a coordinate basis (distinguished
from that used hitherto only by a rotation of the $y'$-, $z'$-axes) such
that in the paraxial region V takes the usual form (73.6); and we call
it the adapted coordinate basis. Alternatively we shall often simply
speak of adapted coordinates. In the following investigation of
semi-symmetric systems the use of adapted coordinates will be
understood throughout.
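
That the rotation (73.5) with $\tan\psi = k_4/k_2$ removes $\tau$ from the paraxial point characteristic can be seen numerically. The sketch below assumes $\tau = z'y - zy'$ and the paraxial form (73.1) with the constant $a_0$ omitted; the function names are illustrative.

```python
import math

def star(u):
    """Adjoint two-vector U* = (-U_2, U_1)."""
    return (-u[1], u[0])

def invariants(y_prime, y):
    """xi, eta, zeta and the skew invariant tau = z'y - zy'."""
    xi = y_prime[0] ** 2 + y_prime[1] ** 2
    eta = y_prime[0] * y[0] + y_prime[1] * y[1]
    zeta = y[0] ** 2 + y[1] ** 2
    tau = y_prime[1] * y[0] - y[1] * y_prime[0]
    return xi, eta, zeta, tau

def paraxial_v(k1, k2, k3, k4, y_prime, y):
    """Paraxial point characteristic (73.1) without the constant a_0."""
    xi, eta, zeta, tau = invariants(y_prime, y)
    return 0.5 * k1 * xi + k2 * eta + 0.5 * k3 * zeta + k4 * tau

def rotate_image_axes(y_prime, psi):
    """The substitution (73.5): y' -> y' cos(psi) + y'* sin(psi)."""
    ys = star(y_prime)
    return (y_prime[0] * math.cos(psi) + ys[0] * math.sin(psi),
            y_prime[1] * math.cos(psi) + ys[1] * math.sin(psi))
```

Evaluating `paraxial_v` at the rotated coordinates reproduces the adapted form (73.6), with $(k_2^2 + k_4^2)^{1/2}$ in the place of $k_2$ and no $\tau$ term.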

The manifold results of Section 14 can now happily be taken over
as they stand. One small point of terminology, however, remains to
be examined. In the context of the symmetric system the coordinate
basis was always such that the planes $z = 0$ and $z' = 0$ were in fact
one and the same plane; and we called it the meridional plane.
Now, however, unless $\psi = 0$ the planes $z = 0$ and $z' = 0$ are distinct.
The best we can do, if we wish to retain something like the
usual terminology, is to take 'the meridional plane' to consist of
two distinct 'sheets'. They contain a common line, but their normals
make an angle $\psi$ with each other. Even so, outside the paraxial
region we cannot simply speak of 'meridional rays'. It is true that
if the initial ray is meridional the final ray will intersect the axis in
the image space, or else is parallel to the axis there. This may be
inferred directly from the fact that the invariant j, defined by (8.6),
and whose existence is guaranteed by the axial symmetry of K,
vanishes under the present conditions. It does not follow, however,
that the final ray is also meridional.


74. Remark on the ideal characteristic function 


We have already considered the ideal point characteristic twice, 
namely in Sections 7 and 15, and it should be clear that (15.6) 
applies here also. Were we not using adapted coordinates, however, 


we should have to set Y’ = y, cosy—yi siny (74.1) 


in calculating the optical distance between D’ and O’ as in Section 
15, with the result that we should then have to write 
u=€-—2ycosw+C—arsiny, (74.2) 


where y, now takes the place of y in T. 

Suppose now that we regard the image as 'ideal' provided it is
merely plane and sharp. In the case of the symmetric system this
meant that there existed a function $D(\zeta)$ such that $\mathbf{Y}' = \mathbf{y}_1 D(\zeta)$.
Now, however, the line $O_0'O'$ need not make the same angle with the
meridional plane as $O_0O$ does. To accommodate this feature we
therefore have the weaker condition

$$\mathbf{Y}' = \mathbf{y}_1 D(\zeta) + \mathbf{y}_1^*\hat{D}(\zeta), \qquad (74.3)$$

where

$$D(\zeta) = 1 + d_1\zeta + \ldots, \qquad \hat{D}(\zeta) = \hat{d}_1\zeta + \ldots. \qquad (74.4)$$

Then, taking $d' = 1$ as usual,

$$V_0 = g(\zeta) - [1 + \xi - 2\eta D + \zeta(D^2 + \hat{D}^2) - 2\tau\hat{D}]^{1/2}. \qquad (74.5)$$

In short, we now have two 'distortion functions', i.e. $D$ and $\hat{D}$.
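
The square root in (74.5) is simply the distance between the pupil point and the displaced image point, and the expression under the root can be checked against a direct computation. The sketch assumes $\tau = z'y_1 - z_1y'$, the adjoint $\mathbf{y}_1^* = (-z_1, y_1)$, and a second distortion function written here as `D_hat`; these names and readings are assumptions, not the author's code.

```python
def invariants(y_prime, y1):
    """xi, eta, zeta and tau = z'y_1 - z_1y' formed with the image coordinates y_1."""
    xi = y_prime[0] ** 2 + y_prime[1] ** 2
    eta = y_prime[0] * y1[0] + y_prime[1] * y1[1]
    zeta = y1[0] ** 2 + y1[1] ** 2
    tau = y_prime[1] * y1[0] - y1[1] * y_prime[0]
    return xi, eta, zeta, tau

def sqrt_argument_74_5(y_prime, y1, D, D_hat):
    """Expression under the root in the assumed form of (74.5)."""
    xi, eta, zeta, tau = invariants(y_prime, y1)
    return 1 + xi - 2 * eta * D + zeta * (D * D + D_hat * D_hat) - 2 * tau * D_hat

def sqrt_argument_direct(y_prime, y1, D, D_hat):
    """1 + |y' - Y'|^2 with Y' = y_1 D + y_1* D_hat, as in (74.3), and d' = 1."""
    Y = (y1[0] * D - y1[1] * D_hat, y1[1] * D + y1[0] * D_hat)
    return 1 + (y_prime[0] - Y[0]) ** 2 + (y_prime[1] - Y[1]) ** 2
```

The two functions agree identically, so that the term $-2\tau\hat{D}$ is seen to be the only place where the skew distortion function enters linearly.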


75. The normal and skew aberration functions


The aberration function is, as usual, the difference between the
actual and the ideal characteristic functions. We continue to work
with the point characteristic, so that the aberration function is
$v = V - V_0$. Recalling (72.1), we write

$$v = v^* + \tau\hat{v}^*, \qquad (75.1)$$

and call $v^*$, $\hat{v}^*$ the normal and skew aberration functions respectively.
The terminology is convenient even though it is $\tau\hat{v}^*$ rather than
$\hat{v}^*$ which induces the 'skew aberrations'. Order by order,

$$v^* = \sum_{n=2}^{\infty} v^{*(n)}, \qquad \hat{v}^* = \sum_{n=2}^{\infty} \hat{v}^{*(n)}, \qquad (75.2)$$

where $v^{*(n)}$ and $\hat{v}^{*(n)}$ are homogeneous polynomials, of degree $n$ and
$n - 1$ respectively, in $\xi, \eta, \zeta$. Explicitly,

$$v^{*(n)} = \sum_{\mu=0}^{n}\sum_{\nu=0}^{\mu} v^{(n)}_{\mu\nu}\,\xi^{n-\mu}\eta^{\mu-\nu}\zeta^{\nu}, \qquad \hat{v}^{*(n)} = \sum_{\mu=0}^{n-1}\sum_{\nu=0}^{\mu} \hat{v}^{(n)}_{\mu\nu}\,\xi^{n-\mu-1}\eta^{\mu-\nu}\zeta^{\nu}, \qquad (75.3)$$

and these define the (characteristic) normal and skew aberration
coefficients. There is no need to attach the superscript * to these, for
the $v^{(n)}_{\mu\nu}$ cannot possibly refer to a formal power series for $v^{(n)}$ itself:
the latter depends on four variables, and so its power series has
coefficients which, in principle, must have three subscripts.

$v^{(n)}_{nn}$ does not enter into the aberrations associated with the conjugate
planes $\mathscr{I}$ and $\mathscr{I}'$. One must, however, not fall into the error of
supposing, from force of habit, that the same is true of the corresponding
coefficient $\hat{v}^{(n)}_{n-1,n-1}$. Accordingly one has in $(2n-1)$th
order $\tfrac{1}{2}n(n+1)$ skew coefficients in addition to the $\tfrac{1}{2}n(n+3)$
normal coefficients, or $n(n+2)$ in all.
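
The coefficient counts just quoted follow from enumerating the index pairs in (75.3). The sketch below assumes the index ranges $0 \le \nu \le \mu \le n$ for the normal and $0 \le \nu \le \mu \le n - 1$ for the skew coefficients, and discards the single normal coefficient of $\zeta^n$, which does not affect the image; the function names are hypothetical.

```python
def normal_coefficient_count(n):
    """Normal coefficients of order 2n-1 that affect the image (zeta^n term excluded)."""
    pairs = [(mu, nu) for mu in range(n + 1) for nu in range(mu + 1)]
    return len(pairs) - 1

def skew_coefficient_count(n):
    """Skew coefficients of order 2n-1 (all of them affect the image)."""
    return len([(mu, nu) for mu in range(n) for nu in range(mu + 1)])

def total_coefficient_count(n):
    """Normal plus skew coefficients of order 2n-1."""
    return normal_coefficient_count(n) + skew_coefficient_count(n)
```

For n = 2 this gives 5 normal and 3 skew coefficients, eight in all, which is the third-order situation treated in the next section.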


76. The third-order displacement 


We shall now investigate the third-order displacement at some
length, since it already brings out very clearly the effects of the
presence of the skew aberration function. The quadratic polynomial
(19.1) represents $v^{*(2)}$, and in addition we have

$$\hat{v}^{*(2)} = \hat{p}_1\xi + \hat{p}_2\eta + \hat{p}_3\zeta. \qquad (76.1)$$

We note that if $v$ has no terms of order less than $2n - 1$, then

$$\boldsymbol{\epsilon}_{2n-1} = 2v^{(n)}_{\xi}\mathbf{y}' + v^{(n)}_{\eta}\mathbf{y}_1 + \hat{v}^{*(n)}\mathbf{y}_1^*, \qquad (76.2)$$

which reduces to (17.7) when $\hat{v}^* = 0$. Thus, when $n = 2$,

$$\boldsymbol{\epsilon}_3 = (4p_1\xi + 2p_2\eta + 2p_3\zeta + 2\hat{p}_1\tau)\mathbf{y}' + (p_2\xi + 2p_4\eta + p_5\zeta + \hat{p}_2\tau)\mathbf{y}_1 + (\hat{p}_1\xi + \hat{p}_2\eta + \hat{p}_3\zeta)\mathbf{y}_1^*. \qquad (76.3)$$

Since K is axially symmetric we can always arrange $z_1$ to be zero,
as in Section 19. Accordingly we can use (19.3) and (19.4) in (76.3),
whilst

$$\tau = \rho h'\sin\theta. \qquad (76.4)$$

Further, we retain (19.6) since the skew aberration function merely
causes the addition of new terms to those previously encountered in
equations (19.7). For the sake of uniformity we also define

$$\hat\sigma_2 = \hat{p}_1, \qquad \hat\sigma_3 = \hat{p}_2, \qquad \hat\sigma_5 = \hat{p}_3. \qquad (76.5)$$

(76.3) then becomes

$$\begin{aligned}
\epsilon_{3y} &= \sigma_1\rho^3\cos\theta + \rho^2h'[\sigma_2(2 + \cos 2\theta) + \hat\sigma_2\sin 2\theta] + \rho h'^2[(3\sigma_3 + \sigma_4)\cos\theta + \hat\sigma_3\sin\theta] + \sigma_5h'^3, \\
\epsilon_{3z} &= \sigma_1\rho^3\sin\theta + \rho^2h'[\sigma_2\sin 2\theta + \hat\sigma_2(2 - \cos 2\theta)] + \rho h'^2[(\sigma_3 + \sigma_4)\sin\theta + \hat\sigma_3\cos\theta] + \hat\sigma_5h'^3.
\end{aligned} \qquad (76.6)$$

The number of types of aberrations is not affected by the presence
of $\hat{v}^*$, since types are essentially characterized by their dependence
upon $\rho$ and $h'$. Nevertheless, each type is now subdivided into two,
of which one is normal, the other skew. They may be considered
separately or jointly: here we pursue the second course.
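
The reduction of (76.3) to (76.6) is a matter of substituting the polar variables and collecting terms, and it can be verified numerically. In the sketch below the relations $\sigma_1 = 4p_1$, $\sigma_2 = p_2$, $\sigma_3 = p_4$, $\sigma_4 = 2p_3 - p_4$, $\sigma_5 = p_5$ are inferred by matching terms, since (19.6) is not reproduced in this chapter, and the skew quantities are written with the suffix `_hat`; all of this is an assumption made for illustration.

```python
import math

def eps3_from_p(p, p_hat, rho, theta, h):
    """Third-order displacement from (76.3) with y' = rho(cos, sin), y_1 = (h', 0)."""
    c, s = math.cos(theta), math.sin(theta)
    xi, eta, zeta, tau = rho**2, rho * h * c, h**2, rho * h * s
    A = 4 * p[1] * xi + 2 * p[2] * eta + 2 * p[3] * zeta + 2 * p_hat[1] * tau
    B = p[2] * xi + 2 * p[4] * eta + p[5] * zeta + p_hat[2] * tau
    C = p_hat[1] * xi + p_hat[2] * eta + p_hat[3] * zeta
    # y_1 = (h', 0) and its adjoint y_1* = (0, h')
    return (A * rho * c + B * h, A * rho * s + C * h)

def eps3_from_sigma(p, p_hat, rho, theta, h):
    """Third-order displacement in the form (76.6), using the inferred sigma relations."""
    s1, s2, s3, s4, s5 = 4 * p[1], p[2], p[4], 2 * p[3] - p[4], p[5]
    h2, h3, h5 = p_hat[1], p_hat[2], p_hat[3]
    c, s = math.cos(theta), math.sin(theta)
    c2, sn2 = math.cos(2 * theta), math.sin(2 * theta)
    ey = (s1 * rho**3 * c + rho**2 * h * (s2 * (2 + c2) + h2 * sn2)
          + rho * h**2 * ((3 * s3 + s4) * c + h3 * s) + s5 * h**3)
    ez = (s1 * rho**3 * s + rho**2 * h * (s2 * sn2 + h2 * (2 - c2))
          + rho * h**2 * ((s3 + s4) * s + h3 * c) + h5 * h**3)
    return (ey, ez)
```

The skew coefficients enter only through the terms in $\sin 2\theta$, $2 - \cos 2\theta$, and the cross terms of the astigmatic part, which is the subdivision of each type into a normal and a skew member described above.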

The terms varying as $\rho^3$ evidently represent spherical aberration,
and since they are not affected by the presence of $\hat{v}^{*(2)}$ they need not
be discussed again. The partial displacement which varies as
$\rho^2h'$ may be written

$$\tilde{y} = a[(2 + \cos 2\theta) + k\sin 2\theta], \qquad \tilde{z} = a[\sin 2\theta + k(2 - \cos 2\theta)]. \qquad (76.7)$$

This corresponds exactly to (19.12), whilst the additional parameter
$k = \hat\sigma_2/\sigma_2$. To begin with, we observe that when $\sigma_2 = 0$
we have exactly the usual primary comatic flare, except that its axis of
symmetry lies along the $\tilde{z}$-axis instead of pointing along the $\tilde{y}$-axis,
as it does when $\hat\sigma_2 = 0$. Apart from this rotation of the flare as a
whole (which must not be confused with the rotation of the normal
flare accounted for by the rotation of the ideal image relative to the
object) normal and skew primary coma therefore have essentially
the same character. As a matter of fact it is not difficult to show that,
to within a rotation, one has exactly the usual comatic flare when
normal and skew coma are taken jointly. To this end, rotate the
local $\tilde{y}$-, $\tilde{z}$-axes through an angle $\hat\psi = \arctan k$:

$${}'\tilde{\mathbf{y}} = \tilde{\mathbf{y}}\cos\hat\psi - \tilde{\mathbf{y}}^*\sin\hat\psi; \qquad (76.8)$$

and thereafter replace the angle $\theta$ by ${}'\theta + \hat\psi$. Then

$${}'\tilde{y} = \rho^2h'(\sigma_2^2 + \hat\sigma_2^2)^{1/2}(2 + \cos 2\,{}'\theta), \qquad {}'\tilde{z} = \rho^2h'(\sigma_2^2 + \hat\sigma_2^2)^{1/2}\sin 2\,{}'\theta. \qquad (76.9)$$

The state of affairs represented by (76.9) is illustrated in Fig. 7.1,
in which the comatic displacement is of course very greatly
exaggerated.
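
The statement that joint normal and skew coma is the usual flare turned through $\arctan k$ can be tested directly. The sketch assumes (76.7) in the form $\tilde{y} = a[(2 + \cos 2\theta) + k\sin 2\theta]$, $\tilde{z} = a[\sin 2\theta + k(2 - \cos 2\theta)]$ with $a > 0$, and takes the new pupil angle to be the old one less the rotation angle; the names are illustrative.

```python
import math

def coma_76_7(a, k, theta):
    """Joint normal and skew comatic displacement in the assumed form of (76.7)."""
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    return (a * ((2 + c2) + k * s2), a * (s2 + k * (2 - c2)))

def components_in_rotated_axes(v, angle):
    """Components of v referred to local axes turned through `angle`, as in (76.8)."""
    c, s = math.cos(angle), math.sin(angle)
    return (v[0] * c + v[1] * s, -v[0] * s + v[1] * c)

def coma_76_9(a, k, theta_new):
    """Ordinary flare of (76.9), with amplitude a (1 + k^2)^(1/2)."""
    amp = a * math.sqrt(1 + k * k)
    return (amp * (2 + math.cos(2 * theta_new)), amp * math.sin(2 * theta_new))
```

Since $a(1 + k^2)^{1/2} = \rho^2h'(\sigma_2^2 + \hat\sigma_2^2)^{1/2}$ when $a = \sigma_2\rho^2h'$ is positive, the amplitude used here is the one appearing in (76.9).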


Fig. 7.1


Next we come to the terms of (76.6) varying as $\rho h'^2$. The partial
displacement may be considered in an out-of-focus plane shifted
relatively to $\mathscr{I}'$ by an amount $x = kh'^2$, exactly as in Section 19, so
that

$$\tilde{y} = [(3\sigma_3 + \sigma_4 - k)\cos\theta + \hat\sigma_3\sin\theta]\,\rho h'^2, \qquad \tilde{z} = [(\sigma_3 + \sigma_4 - k)\sin\theta + \hat\sigma_3\cos\theta]\,\rho h'^2. \qquad (76.10)$$

The displacement is symmetric with respect to I′ and has essentially
the usual character of curvature of field. The image is in general
elliptical, but the axes of the ellipse do not lie along the $\tilde{y}$-, $\tilde{z}$-axes,
being rotated through an angle arctan [63/(203+ o,—)] relative to 


them. Again, two mutually perpendicular focal lines exist, namely
in the image planes for which $k$ has one or other of the values

$$k = (2\sigma_3 + \sigma_4) \pm (\sigma_3^2 + \hat\sigma_3^2)^{1/2}. \qquad (76.11)$$

The skew coefficient $\hat\sigma_3$ evidently represents a form of astigmatism,
since its presence leads to focal lines even when $\sigma_3 = \sigma_4 = 0$. The
image patch is circular when $k = 2\sigma_3 + \sigma_4$, irrespectively of the
value of $\hat\sigma_3$.
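
Both statements, the position of the focal lines and the circular patch midway between them, follow from treating (76.10) as a linear map of the pupil circle. The sketch below assumes that form of (76.10) with the common factor $\rho h'^2$ removed; the skew coefficient is written `s3_hat`, and the function names are hypothetical.

```python
import math

def field_matrix(s3, s4, s3_hat, k):
    """Matrix taking (cos theta, sin theta) to the displacement of (76.10)."""
    return ((3 * s3 + s4 - k, s3_hat), (s3_hat, s3 + s4 - k))

def determinant(m):
    """Determinant of a 2x2 matrix; it vanishes when the ellipse collapses to a line."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def focal_shifts(s3, s4, s3_hat):
    """The two values of k given by (76.11)."""
    r = math.hypot(s3, s3_hat)
    return (2 * s3 + s4 - r, 2 * s3 + s4 + r)

def radius_spread(s3, s4, s3_hat, k, samples=360):
    """Difference between largest and smallest distance of the zonal curve from its centre."""
    m = field_matrix(s3, s4, s3_hat, k)
    radii = []
    for i in range(samples):
        t = 2 * math.pi * i / samples
        c, s = math.cos(t), math.sin(t)
        radii.append(math.hypot(m[0][0] * c + m[0][1] * s, m[1][0] * c + m[1][1] * s))
    return max(radii) - min(radii)
```

A vanishing determinant marks a focal line, and a vanishing radius spread marks the circular patch; with `s3 = s4 = 0` and `s3_hat` non-zero the focal lines persist, which is the astigmatic character of the skew coefficient noted above.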

Finally we come to distortion, for which the partial displacement is

$$\tilde{y} = \sigma_5h'^3, \qquad \tilde{z} = \hat\sigma_5h'^3. \qquad (76.12)$$

The presence of two coefficients here reflects just the presence of
the two distortion functions in the point characteristic of equation
(74.5). From the latter one verifies easily that $\sigma_5 = d_1$, $\hat\sigma_5 = \hat{d}_1$, so
that (76.12) is of course consistent with the more general equation
(74.3).


77. Higher-order displacements


We consider very briefly the displacement of order $2n - 1$ induced
by the aberration function of that order. The situation is clearly
very similar to that dealt with in Section 21. In view of this, a few
general remarks will suffice. Accordingly we make the transition to
the variables $\rho, \theta, h'$ in (75.3). Equations (21.2) may be retained,
Papney A = Y (v™ cos +5, , sind) cos’-16, (77.1) 
the summation going over values of $\mu$ and $\nu$ such that $\mu + \nu = \lambda$,
with the understanding that when the value of the second subscript 
of any coefficient exceeds that of the first, the coefficient in question 
is to be taken as zero. From this the partial displacements follow 
by means of (21.4). For instance, with regard to circular coma, the 
position is very much the same as that encountered in third order.
In fact, one has exactly the (2n —1)th-order flare familiar from the 
symmetric case (cf. equations (21.9)), except that it is rotated bodily 
through an angle y™ = arctan (3%/v) relative to the f-axis. When 
several orders are superposed one has a similar state of affairs with 
regard to any particular zone. However, one must bear in mind that 
y™ depends upon n. Therefore, under conditions such that a 
zonal family of circles possesses an envelope, this will, in general, 
no longer be a pair of straight lines, but two curves of a more or less
complicated character. Again, spherical aberration is unaffected by 
the presence of the skew aberration function, whilst the effects of 
the latter on curvature of field and distortion are amply illustrated 
by (76.10) and (76.12). 

The additional types which enter as one goes from order three to 
order five are, of course, oblique spherical aberration and elliptical 
coma. The first of these now involves three, rather than two, 
coefficients, so that the generic shapes of the usual zonal curves 
form a two-parameter family. It is therefore hardly feasible, nor is it 
useful, to display representative examples after the fashion of Fig. 
4.4. Elliptical coma is, in a sense, even more complicated, in as far as 
it is jointly governed by four coefficients, of which two are normal 
and two are skew. The only truly simple feature is here that every 
zonal curve is an ellipse, though in special circumstances this may 
degenerate into a straight line. In fact, absorbing a factor $\rho^2h'^3$ in
each of the aberration coefficients, set


Onna = HA—b), Of 1 »-2 = b, 
Oe 1n—3 = HG+b), Be n-2 = G—2b). 
Then 9 = acos 20+ dsin20+(a+b), 
&= cbs Bee 


As before, one may have a pair of real tangents to the set of curves
generated by varying $\rho$; and when they exist they are a pair of straight
lines through I′. A typical case is illustrated by Fig. 7.2. The line
of centres does not lie along either coordinate axis, nor does it 
coincide with the line bisecting the angle between the tangents. 
The image will be triangular when ab = ab. 

From the particular cases we have discussed it will be seen that,
qualitatively speaking, the presence of the skew aberration function
$\hat{v}^*$ has two principal effects on the zonal curves, when these are
compared with the zonal curves when $\hat{v}^* = 0$: (i) it displaces their
centres away from the $\tilde{y}$-axis, and (ii) rotates them bodily about their
centres.


(77.2) 


Fig. 7.2


78. Effective coefficients. Ancillary functions


The effective aberration coefficients are, as always, defined to be
the coefficients of the power series for the actual displacement $\boldsymbol{\epsilon}'$,
regarded as a function of appropriate variables. These we again take
to be $\mathbf{y}'$ and $\mathbf{y}_1$. However, in place of the rather simple polynomial
(23.1) for the displacement of order $2n - 1$, the corresponding
polynomial here is a good deal more complicated, in that it involves
five, instead of only two, kinds of effective coefficients. Indeed,


n-1 4 


€on—1 = ; (u@y’ +70 y, 7 ain yi) Ere ree 


1 
B=0 v= 
n—-2 pb 
+S Cupy' + apy eretgter, (78.1) 


The number of effective coefficients of order $2n - 1$ is $n(2n + 1)$,
so that there must be $n(2n + 1) - n(n + 2) = n(n - 1)$ identities
between them (cf. the end of Section 23).

In fifth order the state of affairs is fortunately rather simple, since 
(in an obvious notation) the sz and sz are exactly given by (23.3); 
in other words, the coefficients 6, do not occur in them. (Note, 
however, that in seventh order, for instance, they will turn up in 
t% and if.) We therefore need to write down only the following
additional relations 


Sf = 5443p, 
SF = 5,— py + 3B, 
SS = 5g +3), + 3Ds, 
ST = 54— Po, 
53 = 5,+4b2—4Bs, 


5g =SetsPs> 


*s¥ = 45,4+3p1, *5* = §,-3f,4-3B., 
*s¥ = 25,-4),+2p., *5F = eee 
#s¥ = 2554p, —po+ ps; *5¥ = §5+3Bo—Bs. (78.2) 


One could go on like this to higher orders, but one would evi- 
dently be confronted with a confusing and rather lengthy array of 
relations. It is preferable to restore some semblance of simplicity 
by going over to one of the ancillary functions previously introduced 
in Section 37, i.e. to the spherical point characteristic, or better 
still, the modified spherical point characteristic. Owing to the use 
of adapted coordinates the basic equations relating to V* are
unaffected by the presence of the skew aberrations. Consequently,
let the $v^{(n)}_{\mu\nu}$ and $\hat{v}^{(n)}_{\mu\nu}$ be understood to be the coefficients of the
modified spherical point characteristic aberration function, and further,
let the ui, 7%, ... be interpreted in the sense of equation (78.1), but 
with y’ referring to the ideal wavefront, and not to the plane of the 
exit pupil. Then the equations 


um) = 2(n — pt) Ds um = (u —p+ 1) Va ys a = o> | 


78. 
ul) = (n—p+1)d™, * = (u— p+1)oM 1, J ee 


are exact for n = 2 and n = 3, and are likely to give a very close 
approximation to the actual effective coefficients when n > 3. 

Finally, it may be noted that the considerations involving the 
wavefront and the various functions associated with it, as set out in 
Sections 38 and 39, are not affected by the presence of the skew 
aberration function. 
79. Shifts of stop and object


We investigate briefly the effects of shifts of the object and of the 
stop. To this end we naturally use the angle characteristic, though, 
if one is not content with the pseudo-displacement or some ana- 
logous approximation to e’, the complex relationship between the 
effective and the characteristic coefficients always lurks threaten- 
ingly in the background. 

Proceeding exactly as in Section 46, equation (45.3) is the basic
relationship from which all the required results flow, $t(\xi, \eta, \zeta, \tau)$ now
replacing $t(\xi, \eta, \zeta)$, of course. (This simple state of affairs is due to
our use of adapted coordinates, coupled with the fact that the angle
$\psi$ in equations (73.4) in no way depends on the values of s or m.)
In addition to (46.4) we have

a 1—pq 
(@—p)(1-9)" 
Now observe that $\xi, \eta, \zeta$ on the one hand and $\tau$ on the other transform
independently of each other, from which it follows that (45.3)
splits up into two separate identities. Thus we have an equation
exactly like (45.3) for the normal aberration function $t^*$, whilst the
skew aberration function $\hat{t}^*$ enters only into the identity


H*(E, }, é) — Tt*(é, qs ¢), 


or, on account of (79.1), 


##(6.9,0) = Gane; r*(E,9, 0). (79.2) 


(79-1) 


It is convenient to write 1/d for the initial factor on the right; see 
equations (46.1). Then one finds very easily that the new primary 
skew coefficients are given by 


Ay = d-c*(p, + pho +P"Bs); 
Ba = d*bc[2gp, + (1+ 9) Bat 2pba), (79-3) 
bs = dV°(G°B, + 9B. ds). 
For the normal coefficients one has, of course, (46.5) as before. 
We do not need to write down the equations for the secondary 


skew coefficients, for a moment’s reflection shows that they result 
from equations (46.5) by the following formal prescription: (i) set 
Ji = 9; (ii) replace every p, by &,; and (iii) replace every p, by 
ds. A similar situation prevails with respect to the tertiary skew 
coefficients, for which the required equations are to be read off from 
(46.9) using the formal prescriptions (i) j, > 0, (ii) 5, >#,, (iii) 
s,> di.,. Evidently our patience in writing out many long sets of 
equations in the context of the symmetric system is paying divi- 
dends now. 


80. On invariant aberrations 


The ideas of Sections 46-52 remain entirely relevant to the semi- 
symmetric system. Indeed everything that was said before applies 
quite unchanged to the normal coefficients. It therefore only re- 
mains to investigate invariant and semi-invariant skew aberrations. 
Equations (79.3) already show at once that, in the third-order 
domain, there are two semi-invariants, but no absolute invariants; 
the former being the s-invariant (s—m)’ p, and the m-invariant 
(s—m)? p,. The interesting feature which emerges here is that the 
skew semi-invariants are of comatic type, whereas the normal semi- 
invariants were of astigmatic type. In particular when primary 
coma is entirely skew it cannot be removed by a shift of the stop 
(recall that now & = (s—m) p,). 

Again, one sees immediately from (46.7) that the expression 
(s —m)? (253 —§,) will be an absolute invariant; and this relates to 
secondary elliptical coma. Now suppose that 2§,—§5, vanishes for 
some position of the object and of the stop. This implies just that 
the quantity 6 in (77.2), with n = 3, is zero. If therefore we 
suppose, for the sake of illustration, that the normal aberrations have 
been removed, then the flare due to secondary skew elliptical 
coma will be triangular, with its base parallel to the -axis, the 
angle at $I'$ being a right angle; and it will have this shape for all 
values of s and m. 

The method of generators of Section 49 may be extended— 
somewhat trivially perhaps—by adjoining to the previous set (49.1- 
3) the generator T and its adjoint T, defined by 


0 


Cree (80.1) 


T=(s-m)z., - 


so that aT, (80.2) 
Then the various skew invariants and semi-invariants result from 
the application to $t$ of operators which are products of T and 
appropriate powers of A, S and M. In particular, there is one 
absolute skew invariant of order 4N+1 (N = 1, 2, ...), and it is given 


by Sanya = TANCND = (s—m) ANF*ON+D, (80.3) 


The semi-invariants and conditional invariants are likewise given 
by expressions closely resembling those obtained previously. 


81. Sine-relation and cosine-conditions 


Since the ideal point characteristic $V_0$ of the semi-symmetric 
system K is formally the same as that of the symmetric system, 
relative to an adapted coordinate basis, it follows at once that when 
there exists a pair of perfect conjugate planes $\mathscr{J}$ and $\mathscr{J}'$ all rays 
through the axial points of these satisfy the sine-relation (27.3). 
On the other hand the situation is more complex with regard to 
the analogue of the sine-condition of Section 28, as will now be set 
out. 

In the first place, the aberration function (26.1), which included 
no terms varying non-linearly with $h'$, must now be supplemented 
by a skew term, i.e. here 

$$v = S(\xi) + \eta\,C(\xi) + \tau\,\bar{C}(\xi). \tag{81.1}$$
Then it is easily confirmed that for axial rays (y, = ©) 
8’ —m-18 = (28+C)y' —Cy’*. (81.2) 


Now, even when y = 0, 7’ will in general fail to vanish, and it is 
not good enough to retain the sine-condition in the form 


sin d’ —m-1sin = 0, 


ie. m(B'2+ 2) = B+ (81.3) 
Taking spherical aberration to be absent for the sake of orientation, 
(81.3) merely implies C24C242C/w =o, (81.4) 


a result which is unnecessarily weak. Evidently the sine-condition 
should now be expanded into the cosine-conditions even in the 
context of axial rays (cf. Section 34), i.e. 


B’—m™B = 0, (81.5) 
for when $S = 0$, these entail 
$$C = 0 \quad\text{and}\quad \bar{C} = 0. \tag{81.6}$$


In the absence of spherical aberration, satisfaction of the cosine- 
conditions therefore implies the absence of both normal and skew 
circular coma. 

In the presence of spherical aberration one naturally takes in 
place of (81.5) the modified cosine-conditions 


8’ —m™B =o, (81.7) 


where, in the notation of Section 29, f’,y’ are the cosines of the 
angles which the line D’O, makes with the $y'$- and $z'$-axes. 


82. The two offences against the cosine-conditions 


It is of considerable interest to examine whether here again there 
exists a close relationship between the exact total circular comatic 
displacement of certain rays and the extent to which the cosine- 
conditions are not satisfied by axial rays of the same aperture. To 
do so we first require the appropriate generalization of equation 
(26.5), to include the effects of the presence of the term 7C in (81.1). 
To the various quantities defined by equations (26.3) we naturally 


aqjomn O=-2w'C, R=uC; (82.1) 
and then one finds after some rather tedious manipulations that 
 e' = {(1-JP) + P[(P*R—O) 9+ (P2R-O) ry’ 
—(1-JR)y,+JRy#. (82.2) 
For the transverse and longitudinal spherical aberration one has 
o=p(1—JP), d=~—1+1/JP, (82.3) 


as before. On the other hand, upon defining meridional and 
sagittal rays to be rays through O—itself in the meridional plane 
—which have 6 = 0 or 7, and 6 = + 4m respectively, we see that 
€, in general does not vanish for tangential rays. We thus have no 
option but to contemplate, in place of x,, two quantities, x and & 
say, defined as 

kK=6,(0=47), K=6(0=0). (82.4) 
At the same time we write $K = \kappa/h'$ and $\bar{K} = \bar{\kappa}/h'$, so that 


K=JR-1, K=JR. (82.5) 
Now wR/m = —Ry'+Ry’*, w' = —Py’, 
whence PB/m = Re' —Rp’*. (82.6) 


From the last pair of equations it follows that 


mR _ fp +yy' mR _ By'-yB (82 ) 
P prety? > P prety”? . 7 


It is convenient to introduce the abbreviation 

M = m(1+6)(B? +"), 

and then (82.3), (82.5) and (82.7) finally yield 
=-1+(Bh' +yy')/M, K=(fy'—yP’)/M. (82.8) 


The direction cosines on the right may be taken as relating to an 
arbitrary axial ray; indeed, K and K are obviously invariant under 
rotations. The closest resemblance to (31.5) is obtained by choos- 
ing it to be sagittal, for then y’ = o, and therefore 


_ B = Y 

Kem ateeayp? 8 =~ misaye — @29) 
It is natural to call $K$ and $\bar{K}$ the normal and skew offence against 
the cosine-conditions respectively. When both vanish simultaneously 
the comatic zonal circle, corresponding to the value of $\rho$ in question, 
passes through $I'$. 


83. Reversible systems 


The definition of reversibility given in Section 53 in the context 
of the symmetric system may be retained for the semi-symmetric 
system. One must, however, be clear about the implications of this 
as regards the screw-sense. Thus, consider some ray $\mathscr{R}$ through K. 
The ‘reflected’ initial and final rays define a certain ray which will 
be the reflection of $\mathscr{R}$ as a whole only if the screw-sense of the 
reflected system K* is the reverse of that of K. For example, when $\mathscr{R}$ 
is, within K, a right-handed spiral, then its reflection $\mathscr{R}^*$ will be 
a left-handed spiral, viewed in each case from the object space; 
and realizability of this by K* requires that the preferential direc- 
tion implied, at each point of the ray, by the screw-sense must have 
opposite signs for K and K* respectively. Physically, in the case 
of an electron-optical system, the direction of the magnetic field 
must be reversed. 

The reversal of the screw-sense implies that if $T = f(\xi, \eta, \zeta, \tau)$ 
is the angle characteristic of K referred to base-points $O_0'$, $O_0$, as 
before, then that of K* is $T^* = f(\xi, \eta, \zeta, -\tau)$. However, on going 
over from K to K*, $\xi, \eta, \zeta, \tau$ go into $\zeta, \eta, \xi, -\tau$; so that we now have 


the basic identity T*(E,7, 2,7) = T*(€,7, 8,7) (83.1) 


as the generalization of (53.1). The fact that adapted coordinates 
are being contemplated need cause no concern on account of the 
axial symmetry of K. 

It is immediately clear that (83.1) breaks up into two separate 
identities. The first of these involves only 7*; and it is just (54.2) 
with T in this replaced by T*. It follows that the work of Sections 
54-60 may be retained in its entirety, provided it is understood to 
concern the normal aberration function only. In particular we have 
the relations (56.5), (57.1), (59.5) governing the normal aberra- 
tion coefficients of the various orders. 

It remains to investigate the effects of reversibility on the skew 
coefficients, which are implied by the identity 


Su (&, UB ¢) = E*(E, q, ¢). (83.2) 


Since one does not have a problem here analogous to that of elimi- 
nating the unwanted normal coefficients ™, the device of intro- 
ducing the auxiliary variables L, M, N, defined by (55.1), is, strictly 
speaking, superfluous. However, as they are already available, we 
may as well use them, especially as they endow the work with a 
certain additional simplicity. Consequently the factors multiplying 


Derr N” and = L’Me-Nr-r-1 
in the polynomial ¢*™(u?Z + 2uvM+v?N, w(uL +v0M),w*L) must 
be equal to each other. We note here that all relations between the 


skew coefficients will be homogeneous. 
When n = 2 one gets just one relation, namely 


(u? —v*) p, + wup, + wpy = 0, (83.3) 
where we suppose for the time being that $m^2 \neq 1$. In terms of the 
coefficients of equations (76.5) therefore 


(1 —s”) 6+ (1 —sm) 634+ (1 —m*) 6, = 0. (83.4) 


The two fifth-order relations may be read off from (56.3) and 
(56.10), since the polynomial to be considered is given by (56.2) if, 
in this, we set d* = o and replace p, by §,. Thus 


4(1 —32) uh, + (gu? 08) 5, + 20eo(S, +54) + 20%, = 0, 


(1 —s?) (u2 +. v?) §, + 038, + u2e0(S5 +-5,) + uws, + ws, = 0. 


bs) 


Since the degree of the polynomial #*™ is less by one than that 
of ¢*™ the number of relations between the skew coefficients of 
order 2n — 1 may be read off from (58.4), replacing 7 in this by m— 1, 
and increasing the number so obtained by 1, sinceZ™, ,_ need not 
be eliminated. Thus the number in question is 


$$\tfrac14(n^2-1) \ \text{when } n \text{ is odd, and}\quad \tfrac14 n^2 \ \text{when } n \text{ is even.} \tag{83.6}$$


Altogether the number of relations between the $n(n+2)$ coefficients 
of order $2n-1$ is therefore $\tfrac12(n-1)(n+2)$, leaving $\tfrac12(n+1)(n+2)$ 
which are mutually independent. 
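
The bookkeeping above is easy to tabulate. The following Python sketch merely evaluates the counts quoted in the text, namely (83.6) for the skew relations and $\tfrac12(n-1)(n+2)$ for all relations. The function names are illustrative, and the number of relations attributed to the normal coefficients is simply the difference of the two quoted figures, not an independent derivation of (58.4).

```python
def skew_relations(n: int) -> int:
    """Relations among the skew coefficients of order 2n-1, as quoted in (83.6)."""
    return (n * n - 1) // 4 if n % 2 else (n * n) // 4


def all_relations(n: int) -> int:
    """Total number of reversibility relations of order 2n-1 quoted in the text."""
    return (n - 1) * (n + 2) // 2


def independent_coefficients(n: int) -> int:
    """Coefficients left independent: n(n+2) in all, minus the relations."""
    return n * (n + 2) - all_relations(n)


if __name__ == "__main__":
    print(" n  total  skew-rel  normal-rel  independent")
    for n in range(2, 8):
        skew = skew_relations(n)
        normal = all_relations(n) - skew  # implied by the two quoted totals
        print(f"{n:2d} {n * (n + 2):6d} {skew:9d} {normal:11d} "
              f"{independent_coefficients(n):12d}")
```

For n = 2 and n = 3 the skew counts are 1 and 2, in agreement with the single third-order relation (83.3) and the two fifth-order relations (83.5).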

We have yet to consider the case $m^2 = 1$. It will suffice to take 
m = —1, and, at the same time s = 1. Proceeding as in Section 59 
it follows at once that then 


$$\bar{t}^{(n)}_{\mu\nu} = 0 \quad\text{when } \mu - \nu \text{ is odd.} \tag{83.7}$$
In particular, when n = 2 or 3, 
pe=09; §,=5,=0. (83.8) 


In short, all skew aberration coefficients of the completely rever- 
sible semi-symmetric system which are of astigmatic type must 
vanish. This result is to be contrasted with (59.5) which expresses 
the vanishing of the normal comatic types. Note that (83.6) 
continues to hold when $m^2 = 1$. 

Finally, contemplating the possibility of the achievement of 
sharp imagery by semi-symmetric systems, nothing emerges which 
goes essentially beyond what was found in Sections 62 and 63 for 
symmetric systems. The reason for this is quite simply as follows: 
that to obtain a sharp image all skew aberrations other than dis- 
tortion must certainly vanish, bearing in mind that skew curvature 
of field is of astigmatic character. The reversibility condition then 
requires that the skew distortion function D be constant, and there- 
fore zero; cf. equations (74.3, 4). It follows that the presence of an 
inherent screw-sense in K must have no effect beyond rotating the 
image space rigidly through some angle; but such a rotation is 
irrelevant to surfaces of revolution .4*’. 


Problems 


P.7(i). Show that the line of centres (of the zonal circles) of the 
circular comatic flare produced by a semi-symmetric system is a 
straight line if, and only if, the ratio of the comatic functions $C$ and 
$\bar{C}$ is a constant. (Spherical aberration of all orders is taken to be 
absent.) 


P.7 (ii). Given any two sets of values of the primary skew coeffi- 
cients, show that if the first relates to s and m then one can always 
determine $ and *® such that for these the primary skew coefficients 
take the values of the second set. 


P.7 (iii). Given T, show that, with o and p. defined by (24.2), 
e’ = (s—m)(26T,+pT,+p*T). 


P.7(iv). Find the explicit form of the absolute skew invariant 
&. ‘To what type of aberration does it relate? 


CHAPTER 8 


SYSTEMS WITH TRANSLATIONAL 
SYMMETRY, AND PLANE-SYMMETRIC 
SYSTEMS 


84. Definition of translationally symmetric systems with 
or without additional symmetries 


Let a coordinate basis of the kind familiar from the theory of sym- 
metric systems be chosen, so that the x’- and x-axes are collinear, 
and the remaining axes pair-wise parallel. Then a system K is 
translationally semi-symmetric if, upon translating it in a direction 
parallel to the Z-axis, the resulting system is indistinguishable from 
K. For the purposes of this definition the stop is not to be regarded 
as part of K. On the other hand, the object and image planes may be 
included, granted that these are to be taken normal to the x- and 
x’-axes. The fact that K cannot, of course, extend to infinity is 
immaterial. The situation corresponds exactly to that remarked 
upon in Section 64 in the context of the concentric system: a ray 
which passes through K both before and after a translation ‘does 
not know’ that the system has been displaced. 

Consider now the translation of K through a distance s. Then, 
since $z'$ and $z$ are reduced, as usual, $z'$ changes by an amount $N's$, 
and $z$ by $Ns$. Invariance requires that the relation 

$$V(y', z' + N's, y, z + Ns) = V(y', z', y, z)$$

must hold identically for all values of s. The generic form of the 
point characteristic is therefore 

$$V = V(y', y, z' - z_1), \tag{84.1}$$

where $z_1 = (N'/N)z$; so that V is a function of only three arguments. 
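
As a small numerical illustration of why (84.1) follows, the sketch below checks that the combination $z' - z_1$ is unchanged when the reduced coordinates are shifted by $N's$ and $Ns$. The helper names and the sample refractive indices are assumptions made only for this example.

```python
from fractions import Fraction


def translate(z_prime, z, n_prime, n, s):
    """Reduced z-coordinates after K has been shifted through a distance s."""
    return z_prime + n_prime * s, z + n * s


def invariant_argument(z_prime, z, n_prime, n):
    """The combination z' - z_1, with z_1 = (N'/N) z, appearing in (84.1)."""
    return z_prime - Fraction(n_prime) / Fraction(n) * z


if __name__ == "__main__":
    n_prime, n = Fraction(3, 2), Fraction(4, 3)   # hypothetical indices N', N
    z_prime, z = Fraction(7, 10), Fraction(-1, 5)
    before = invariant_argument(z_prime, z, n_prime, n)
    for s in (Fraction(1, 3), Fraction(-2), Fraction(5, 7)):
        zp_s, z_s = translate(z_prime, z, n_prime, n, s)
        after = invariant_argument(zp_s, z_s, n_prime, n)
        print(s, before, after)
```

Exact rational arithmetic is used so that the equality is not blurred by rounding.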

By way of example, a system which consists of a number of 
refracting cylinders having arbitrary normal cross-sections, and 
which are arbitrarily disposed relative to one another except that 
their generators must be parallel to the Z-axis, is certainly invariant 
under translations parallel to this axis. However, when K is thus 
constituted one sees immediately that the positive $z'$- and $z$-directions 
are not preferred over the negative $z'$- and $z$-directions. 
This will evidently always be so in the absence of anisotropies 
within K. We have a situation analogous to that remarked upon 
at the end of Section 72: that is to say, unless K was a purely 
electrostatic electron-optical instrument, the positive and negative 
%’- and 2-directions respectively were not equivalent. We call K 
translationally symmetric if it is translationally semi-symmetric and 
if the ‘meridional plane’ ($z' = 0$, $z = 0$) is a plane of symmetry. 
This means that V must be invariant under the joint reversal of 
sign of $z'$ and $z_1$. Since we impose the condition of regularity upon 
K as usual, this means that only even powers of $z' - z_1$ can appear 
in the power series for V; and this situation is reflected formally by 

$$V = V[y', y, (z' - z_1)^2]. \tag{84.2}$$

If K is translationally symmetric and, in addition, the common 
plane y’ = 0, y = 0 is a plane of symmetry, then K will be called 
c-symmetric. This case is realized in practice when, in the example 
discussed previously, the boundaries of the normal cross-sections are 
symmetric about the axis $\mathscr{A}$ of K, i.e. the common line defined by 
the $x'$- and $x$-axes; in particular they might be composed of arcs 
of circles whose centres lie upon $\mathscr{A}$. c-Symmetry requires that V 
be invariant under the simultaneous reversal of sign of $y'$ and $y$, 
and in place of (84.2) we then write 

$$V = V[y'^2,\ y'y_1,\ y_1^2,\ (z' - z_1)^2], \tag{84.3}$$

where $y_1 = my$. Here $m$ is the magnification in the meridional 
section of K; see Section 85. From (84.2) it follows immediately that 

$$\gamma' = \nu\gamma \qquad (\nu = N/N'), \tag{84.4}$$

which is, once again, a special case of (8.5). 
It is natural to put 

$$\xi = y'^2, \quad \eta = y'y_1, \quad \zeta = y_1^2, \quad \tau = (z' - z_1)^2, \tag{84.5}$$

although the $\xi, \eta, \zeta, \tau$ here have meanings in no way related to those 
of the rotational invariants denoted previously by the same symbols. 
Nevertheless the notation is not inappropriate, since $\xi$, $\zeta$ and $\tau$ may 
be regarded as the ‘elementary reflection invariants’ of the 
c-symmetric system. $\eta$ is not elementary, for 

$$\eta^2 = \xi\zeta; \tag{84.6}$$
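
The defining properties of the variables (84.5) can be confirmed directly. In the sketch below (helper name and sample values are illustrative assumptions) the four quantities are evaluated before and after the two reflections that characterize c-symmetry, and the identity (84.6) is checked in exact arithmetic.

```python
from fractions import Fraction


def reflection_invariants(y_p, z_p, y1, z1):
    """The variables xi, eta, zeta, tau of (84.5)."""
    return y_p * y_p, y_p * y1, y1 * y1, (z_p - z1) ** 2


if __name__ == "__main__":
    y_p, z_p = Fraction(3, 2), Fraction(-1, 4)
    y1, z1 = Fraction(2), Fraction(5, 3)
    base = reflection_invariants(y_p, z_p, y1, z1)
    # joint reversal of sign of y' and y_1 (reflection in the plane y' = 0, y = 0)
    print(reflection_invariants(-y_p, z_p, -y1, z1) == base)
    # joint reversal of sign of z' and z_1 (reflection in the meridional plane)
    print(reflection_invariants(y_p, -z_p, y1, -z1) == base)
    xi, eta, zeta, tau = base
    print(eta * eta == xi * zeta)   # identity (84.6)
```

Reversing the sign of $y'$ alone changes the sign of $\eta$, which is why $\eta$ must be carried along separately although its square is not independent.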
and its inclusion is dictated only by the need easily to accommodate 
the condition of regularity. Evidently we can write, in complete 
analogy with (72.1), 


$$V(\xi, \eta, \zeta, \tau) = V^*(\xi, \zeta, \tau) + \eta\,\bar{V}^*(\xi, \zeta, \tau). \tag{84.7}$$


However, there does not appear to be any general, interesting pro- 
perty of K which would be reflected in the absence of the aberration 
function corresponding to $\bar{V}^*$. 

It is possible, in principle, for a translationally semi-symmetric 
system to have the last symmetry introduced above; in which case 
we would call it c-semi-symmetric. Our purposes will, however, 
be sufficiently well served if we confine our discussion to the sim- 
plest case, i.e. that of the c-symmetric system. We may have addi- 
tional symmetries. Thus K may be reversible, i.e. have a normal 
plane of symmetry. Figure 6.1 may be used as an illustration, if 
the various curves be interpreted as being the boundaries of the 
normal cross-sections of the refracting elements. Again, K may be 
concentric, i.e. invariant under rotations about a line through $\mathscr{A}$, 
normal to the meridional plane. Such additional symmetries entail 
the use of appropriate formal devices to make them tractable, and 
we therefore consider them later on. On the other hand, it is hardly 
appropriate to deal with the c-symmetric system in the same detail 
as we did with the symmetric system; especially as the theory 
of the latter was given at such great length so that it might serve as a 
model for the treatment of systems having different basic sym- 
metries. 

One final point deserves comment. Suppose that a c-symmetric 
system is also invariant under translations in a direction parallel to 
the y-axis. Then, in view of (84.3), 


$$V = V[(y' - y_1)^2,\ (z' - z_1)^2]. \tag{84.8}$$


This is as far as one can go: we conclude that even invariance under 
translations along two (mutually perpendicular) directions does not 
imply the absence of anisotropy within K; for example, N might 
depend upon $\gamma^2$. Were K isotropic, N would be constant over every 
normal plane, and K therefore symmetric; and in (84.8) V would 
reduce to a function of $(y' - y_1)^2 + (z' - z_1)^2$. In short, when classes 
of systems are defined generically by the symmetries they possess, 
one may always have internal anisotropy consistent with these 
symmetries. 


85. The paraxial region 


In the paraxial domain V reduces to 


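$$V = \text{const.} + \tfrac12 k_1\xi + k_2\eta + \tfrac12 k_3\zeta + \tfrac12 k_4\tau, \tag{85.1}$$

the base-points having been chosen arbitrarily for the moment. Then 

$$\beta' = k_1 y' + k_2 y_1, \quad \gamma' = k_4(z' - z_1), \qquad -\beta/m = k_2 y' + k_3 y_1, \quad \nu\gamma = k_4(z' - z_1). \tag{85.2}$$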


This is evidently a special case of (10.10), bearing in mind that a 
c-symmetric system has two planes of symmetry containing $\mathscr{A}$. 
In any such system the y- and z-directions ‘do not interact’ paraxi- 
ally, so to speak. One can therefore always make formal use of the 
work of Section 14, taking the y- and z-components of the equations 
there separately, with the appropriate transcription of the paraxial 
constants. In the present instance this transcription is 


i pe ky, mk, mks ; 
as ay Rg —> ha — a/v, Ralv?, (85.3) 


where B has now been taken to be Op, whilst B’ may be regarded 
as the axial point of the exit pupil. One then concludes immediately 
that K is characterized by two distinct focal lengths $f_1$ and $f_2$, where 
$f_1$ is given exactly by the right-hand member of (14.15), whilst $f_2$ 
is infinite. The sagittal section of K is therefore telescopic, and we 
cannot use the angle characteristic. Of course, in the sagittal section 
the refracting elements are equivalent to a set of plane-parallel slabs. 
All rays from an object point O will pass through just one point $I'$ 
of the image plane only if K is such that $k_1 = k_4$, and $d'$ is chosen to 
have the value $-1/k_1$. $I'$ is evidently virtual (i.e. the rays in the image 
space meet in $I'$ only upon being produced backwards). If $k_1 \neq k_4$ 
meridional rays all pass through one point of the plane which has 
$d' = -1/k_1$, and this point need not be virtual. The complete image, 
formed by all rays admitted by K, will be a line normal to the 
meridional plane: 

$$y = y_1, \qquad z = [(k_1 - k_4)z' + k_4 z_1]/k_1. \tag{85.4}$$
Such line images will be studied more closely in Section 89. 
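
The formation of the line image can be illustrated with a short paraxial ray trace. The sketch below uses the ray directions of (85.2) and carries each ray from the point $(y', z')$ to the plane $d' = -1/k_1$ by adding $d'$ times the direction components. Two assumptions are made here which the text does not state in this form: the transfer rule just described, and the normalization $k_2 = -k_1$, which is what makes the first member of (85.4) come out as $y = y_1$. All names are illustrative.

```python
from fractions import Fraction


def paraxial_image(y_p, z_p, y1, z1, k1, k4):
    """Intersection of a paraxial ray with the plane d' = -1/k1."""
    k2 = -k1                      # assumed normalization (see lead-in)
    d = -1 / k1                   # location of the meridional image plane
    beta = k1 * y_p + k2 * y1     # first member of (85.2)
    gamma = k4 * (z_p - z1)       # second member of (85.2)
    return y_p + d * beta, z_p + d * gamma


def line_image_z(z_p, z1, k1, k4):
    """Second member of (85.4)."""
    return ((k1 - k4) * z_p + k4 * z1) / k1


if __name__ == "__main__":
    k1, k4 = Fraction(-1), Fraction(-3, 4)
    y1, z1 = Fraction(1, 2), Fraction(1, 5)
    for y_p, z_p in [(Fraction(0), Fraction(0)),
                     (Fraction(1, 3), Fraction(-1, 4)),
                     (Fraction(-2, 5), Fraction(3, 7))]:
        y, z = paraxial_image(y_p, z_p, y1, z1, k1, k4)
        print(y, z, line_image_z(z_p, z1, k1, k4))
```

With $k_1 \neq k_4$ the y-coordinate is the same for every ray while z varies with $z'$, so the rays fill a line; setting `k4 = k1` collapses it to the single point $(y_1, z_1)$.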


86. The aberration coefficients 


Let $\mathscr{J}$ be the object plane as usual, and let $\mathscr{J}'$ be the image plane 
meridionally conjugate to it, so that the location of $\mathscr{J}'$ is given by 
the value $-1/k_1$ of $d'$. For the present we take the view that we 
should like K to produce a sharp image in $\mathscr{J}'$, the (reduced) 
magnifications in the $y'$- and $z'$-directions being required to have constant 
values $m$ and $\nu^{-1}$ respectively. It is quite in order to adopt this 
definition of ideality even though we know that K cannot possibly 
produce an image of this kind unless $k_1 = k_4$. 
By the usual argument therefore, 

$$V = g(\zeta) - (1 + \xi - 2\eta + \zeta + \tau)^{1/2} + v, \tag{86.1}$$

where we have set $d' = 1$, as often before. Note that $g$ cannot 
depend on $z_1$, since when the imagery is perfect, V(O, O’) is a function 
of $y_1$ and $z_1$ only, but $z_1$ must of necessity occur together with $z'$, as 
we have seen. If $g(\zeta) = g_0 + \tfrac12 g_1\zeta + \ldots$ it follows from (86.1) that 

$$V = g_0 - 1 - \tfrac12\xi + \eta - \tfrac12(1 - g_1)\zeta - \tfrac12\tau + v + O(4). \tag{86.2}$$

On the other hand, we already know from the work of Section 85 
that 

$$V = \text{const.} - \tfrac12\xi + \eta - \tfrac12[1 + (1/mf_1)]\zeta + \tfrac12 k_4\tau + O(4), \tag{86.3}$$

by transcription of (14.30), applied to the meridional section of K. 
Comparison of the last two equations shows that $g_1 = -1/mf_1$, and 

$$v^{(1)} = \tfrac12(k_4 + 1)\tau. \tag{86.4}$$


The presence of the first-order aberration function of course simply 
reflects our previous conclusion that perfect imagery, in the sense 
understood above, requires that $k_1 = k_4\ (= -1)$; see also Section 89. 
In harmony with (84.7) we now write 

$$v = v^*(\xi, \zeta, \tau) + \eta\,\bar{v}^*(\xi, \zeta, \tau). \tag{86.5}$$


The coefficients of the power series for $v^*$ and $\bar{v}^*$ are the 
aberration coefficients of the c-symmetric system. When n = 1 there 
is only the one non-zero coefficient, $\tfrac12(k_4 + 1)$, in view of 
(86.4). As regards the coefficients of orders 3, 5, 7, ... (there are of 
course none of even order), we have no need to count them, for 
the present state of affairs is precisely analogous to that encountered 
in Section 75. Thus, when $m \neq 0$ the number of aberration coefficients 
of order $2n-1$ ($n > 1$) is $n(n+2)$. In particular there are eight 
third-order coefficients. 

The explicit inclusion here of the condition $m \neq 0$ is necessitated 
by the fact that one obtains a different result when the object is at 
infinity, in which case the ideal image must also lie at infinity. This 
means, in turn, that K must be entirely telescopic, i.e. $f_1 = \infty$ also. 
We then understand perfect imagery to mean that K transforms 
every set of mutually parallel initial rays into a set of mutually 
parallel final rays, i.e. 

$$\beta' = \mu\beta, \qquad \gamma' = \nu\gamma \tag{86.6}$$

for all rays, where $\mu$ is a constant and $\nu$ has its previous significance. 
We now go over to the mixed characteristic $W_1$, base-points at E 
and E’, say. According to (84.4) the second member of (86.6) 
must hold for all translationally symmetric systems, whether 
perfect or not. Thus $\partial W_1/\partial z' = \nu\gamma$, i.e. 

$$W_1 = \nu z'\gamma + A(y', \beta, \gamma), \tag{86.7}$$

where the function A is so far arbitrary. c-Invariance requires that 
its power series be solely in terms of the variables $\xi = y'^2$, $\eta = y'\beta$, 
$\zeta = \beta^2$, $\tau = \gamma^2$. If the planes at infinity be perfect the first member of 
(86.6), i.e. $\partial W_1/\partial y' = \mu\beta$, together with (86.7), shows that 

$$A = \mu y'\beta + g(\zeta, \tau), \tag{86.8}$$

where $g$ is arbitrary. We are thus led to write 

$$W_1 = g(\zeta, \tau) + \mu y'\beta + \nu z'\gamma + w_1, \tag{86.9}$$

where $w_1$ is the aberration function. 

To count the number of coefficients in this case, imagine $w_1$ to be 
written in a form analogous to (86.5). Then there are altogether 
$\tfrac12(n+1)(n+2)$ coefficients of $w_1^*$, of which, however, all those 
multiplying powers of $\zeta$ and $\tau$ only are redundant, and there are $n+1$ of 
these, so that we are left with $\tfrac12 n(n+1)$ coefficients of this type. In 
addition there are $\tfrac12 n(n+1)$ coefficients of $\bar{w}_1^*$. Thus, finally, when 
$m = 0$ the number of aberration coefficients of order $2n-1$ is $n(n+1)$, 
i.e. fewer than when $m \neq 0$. In particular, there are only six 
third-order coefficients. It may be noted that the kind of system to which 
this result relates occasionally occurs in practice, namely as a 
periscopic instrument. 
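
Both counts can be reproduced by listing monomials. The sketch below enumerates the terms of degree n in three variables (the part not containing $\eta$) and the terms of degree n-1 that multiply $\eta$. For the object at infinity it discards every term built from $\zeta$ and $\tau$ alone, as the text prescribes. For a finite object it discards only the pure power $\zeta^n$; that rule is an assumption made here, chosen because it is consistent with the eight terms of (87.2), which contains no $\zeta^2$ term. The function names are illustrative.

```python
from itertools import product


def monomials(degree, nvars=3):
    """Exponent triples (a, b, c) of xi^a zeta^b tau^c with a + b + c = degree."""
    return [e for e in product(range(degree + 1), repeat=nvars)
            if sum(e) == degree]


def count_finite_object(n):
    """Coefficients of order 2n-1 when m != 0 (pure zeta^n assumed redundant)."""
    unbarred = [e for e in monomials(n) if e != (0, n, 0)]
    barred = monomials(n - 1)            # terms carrying the factor eta
    return len(unbarred) + len(barred)


def count_object_at_infinity(n):
    """Coefficients of order 2n-1 when m = 0 (terms in zeta, tau only dropped)."""
    unbarred = [e for e in monomials(n) if e[0] > 0]
    barred = monomials(n - 1)
    return len(unbarred) + len(barred)


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, count_finite_object(n), n * (n + 2),
              count_object_at_infinity(n), n * (n + 1))
```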


87. The third-order displacement when $k_1 = k_4$ 


We first consider the third-order displacement under conditions 
such that the possibility of ideal imagery (in the sense defined above) 
exists in principle. This requires that $k_1 = k_4$, and then we naturally 
contemplate the ideal image plane. In the usual way, 

$$\epsilon_{y3} = 2y'v_\xi + y_1 v_\eta, \qquad \epsilon_{z3} = 2(z' - z_1)v_\tau, \tag{87.1}$$

where $v$ is to include only the third-order terms. Although (86.5) 
was a useful way of writing $v$ in the earlier context, there appears 
to be little point in adhering to it now, nor does there seem to be a 
preferential ordering of the various terms of $v^{(2)}$. We therefore write 

$$v^{(2)} = p_1\xi^2 + p_2\xi\eta + p_3\xi\zeta + p_4\xi\tau + p_5\eta\zeta + p_6\eta\tau + p_7\zeta\tau + p_8\tau^2. \tag{87.2}$$

Then 

$$\begin{aligned}
\epsilon_{y3} &= 2y'(2p_1\xi + p_2\eta + p_3\zeta + p_4\tau) + y_1(p_2\xi + p_5\zeta + p_6\tau),\\
\epsilon_{z3} &= 2(z' - z_1)(p_4\xi + p_6\eta + p_7\zeta + 2p_8\tau).
\end{aligned} \tag{87.3}$$
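
Since (87.1) amounts, by the chain rule applied to the variables (84.5), to taking the partial derivatives of $v^{(2)}$ with respect to $y'$ and $z'$, the expressions (87.3) can be checked against a numerical gradient of (87.2). The following sketch does this by central differences; the coefficient values and the sample point are arbitrary, and the function names are illustrative.

```python
def invariants(y_p, z_p, y1, z1):
    """The variables xi, eta, zeta, tau of (84.5)."""
    return y_p ** 2, y_p * y1, y1 ** 2, (z_p - z1) ** 2


def v2(y_p, z_p, y1, z1, p):
    """Third-order aberration function (87.2); p = (p1, ..., p8)."""
    xi, eta, zeta, tau = invariants(y_p, z_p, y1, z1)
    p1, p2, p3, p4, p5, p6, p7, p8 = p
    return (p1 * xi ** 2 + p2 * xi * eta + p3 * xi * zeta + p4 * xi * tau
            + p5 * eta * zeta + p6 * eta * tau + p7 * zeta * tau + p8 * tau ** 2)


def displacement(y_p, z_p, y1, z1, p):
    """Displacement components as written in (87.3)."""
    xi, eta, zeta, tau = invariants(y_p, z_p, y1, z1)
    p1, p2, p3, p4, p5, p6, p7, p8 = p
    eps_y = (2 * y_p * (2 * p1 * xi + p2 * eta + p3 * zeta + p4 * tau)
             + y1 * (p2 * xi + p5 * zeta + p6 * tau))
    eps_z = 2 * (z_p - z1) * (p4 * xi + p6 * eta + p7 * zeta + 2 * p8 * tau)
    return eps_y, eps_z


def numerical_gradient(y_p, z_p, y1, z1, p, h=1e-5):
    """Central-difference derivatives of v2 with respect to y' and z'."""
    dy = (v2(y_p + h, z_p, y1, z1, p) - v2(y_p - h, z_p, y1, z1, p)) / (2 * h)
    dz = (v2(y_p, z_p + h, y1, z1, p) - v2(y_p, z_p - h, y1, z1, p)) / (2 * h)
    return dy, dz


if __name__ == "__main__":
    p = (0.3, -0.2, 0.5, 0.1, -0.4, 0.25, 0.15, -0.05)   # arbitrary test values
    point = (0.7, -0.3, 0.4, 0.2)                        # y', z', y_1, z_1
    print(displacement(*point, p))
    print(numerical_gradient(*point, p))
```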


If the aperture of the stop is rectangular, one can discuss ‘zonal 
curves’ corresponding to y’ being kept fixed (see the end of this 
section). 

A stop with a circular aperture will usually lie in the image space. 
We now definitely suppose this to be the case, and introduce the 
usual polar coordinates. At the same time, since such a stop does 
not partake of the translational symmetry of K, we cannot simply 
put $z = 0$. It is convenient, therefore, also to introduce polar 
coordinates in the ideal image plane. Thus, altogether, we set 

$$y' = \rho\cos\theta, \quad z' = \rho\sin\theta, \quad y_1 = h'\cos\psi, \quad z_1 = h'\sin\psi, \tag{87.4}$$

so that $h'$ is the ‘ideal image height’, i.e. the perpendicular distance 
of $I'$ from the axis. Of course, $h'$ does not stand in a fixed ratio 
to the ‘object height’ $h\ (= (y^2 + z^2)^{1/2})$ unless $\nu m = \pm 1$. (87.3) then 
gives rise to the rather complicated expressions 

$$\begin{aligned}
\epsilon_{y3} ={}& [(4p_1\cos^2\theta + 2p_4\sin^2\theta)\cos\theta]\,\rho^3\\
&+ [\tfrac12(3p_2 + p_6)\cos\psi + \tfrac12(3p_2 - p_6)\cos\psi\cos 2\theta - 2p_4\sin\psi\sin 2\theta]\,\rho^2 h'\\
&+ 2[(p_3\cos^2\psi + p_4\sin^2\psi)\cos\theta - p_6\sin\psi\cos\psi\sin\theta]\,\rho h'^2\\
&+ (p_5\cos^3\psi + p_6\cos\psi\sin^2\psi)\,h'^3,\\
\epsilon_{z3} ={}& [(2p_4\cos^2\theta + 4p_8\sin^2\theta)\sin\theta]\,\rho^3\\
&+ [-(p_4 + 6p_8)\sin\psi - (p_4 - 6p_8)\sin\psi\cos 2\theta + p_6\cos\psi\sin 2\theta]\,\rho^2 h'\\
&+ 2[-p_6\cos\psi\sin\psi\cos\theta + (p_7\cos^2\psi + 6p_8\sin^2\psi)\sin\theta]\,\rho h'^2\\
&- (2p_7\cos^2\psi\sin\psi + 4p_8\sin^3\psi)\,h'^3.
\end{aligned} \tag{87.5}$$
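
Expansions of this length are easily mistyped, so a numerical cross-check is worthwhile. The sketch below evaluates the displacement once from (87.3) after the substitution (87.4), and once from the grouped trigonometric form of (87.5), and compares the two at a few arbitrary points. The function names and test values are illustrative.

```python
import math


def from_87_3(rho, theta, h, psi, p):
    """(87.3) evaluated with the polar substitution (87.4)."""
    y_p, z_p = rho * math.cos(theta), rho * math.sin(theta)
    y1, z1 = h * math.cos(psi), h * math.sin(psi)
    xi, eta, zeta, tau = y_p ** 2, y_p * y1, y1 ** 2, (z_p - z1) ** 2
    p1, p2, p3, p4, p5, p6, p7, p8 = p
    eps_y = (2 * y_p * (2 * p1 * xi + p2 * eta + p3 * zeta + p4 * tau)
             + y1 * (p2 * xi + p5 * zeta + p6 * tau))
    eps_z = 2 * (z_p - z1) * (p4 * xi + p6 * eta + p7 * zeta + 2 * p8 * tau)
    return eps_y, eps_z


def from_87_5(rho, theta, h, psi, p):
    """The grouped expansion (87.5) in powers of rho and h'."""
    c, s = math.cos(theta), math.sin(theta)
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    cp, sp = math.cos(psi), math.sin(psi)
    p1, p2, p3, p4, p5, p6, p7, p8 = p
    eps_y = ((4 * p1 * c * c + 2 * p4 * s * s) * c * rho ** 3
             + (0.5 * (3 * p2 + p6) * cp + 0.5 * (3 * p2 - p6) * cp * c2
                - 2 * p4 * sp * s2) * rho ** 2 * h
             + 2 * ((p3 * cp * cp + p4 * sp * sp) * c - p6 * sp * cp * s) * rho * h * h
             + (p5 * cp ** 3 + p6 * cp * sp * sp) * h ** 3)
    eps_z = ((2 * p4 * c * c + 4 * p8 * s * s) * s * rho ** 3
             + (-(p4 + 6 * p8) * sp - (p4 - 6 * p8) * sp * c2
                + p6 * cp * s2) * rho ** 2 * h
             + 2 * (-p6 * cp * sp * c + (p7 * cp * cp + 6 * p8 * sp * sp) * s) * rho * h * h
             - (2 * p7 * cp * cp * sp + 4 * p8 * sp ** 3) * h ** 3)
    return eps_y, eps_z


if __name__ == "__main__":
    p = (0.3, -0.2, 0.5, 0.1, -0.4, 0.25, 0.15, -0.05)   # arbitrary test values
    for args in [(0.8, 0.4, 0.6, 1.1), (0.5, 2.3, 0.9, -0.7), (1.2, -1.0, 0.3, 2.9)]:
        print(from_87_3(*args, p), from_87_5(*args, p))
```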

To the four types of primary aberrations which appear here we 
attach the usual names and briefly discuss them in turn. Note that 
any particular coefficient in general now enters into the partial 
displacement corresponding to several types; so that the discussion 
of the separate partial displacements is inherently even more formal 
than usual. 


(i) Spherical aberration 


This is the only axial aberration, and it is now jointly governed by 
three coefficients, so that the state of affairs is, in general, far from 
simple. The image patch is circular only when $2p_1 = p_4 = 2p_8$. 
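
The circularity condition can be seen numerically from the terms of (87.5) that survive for an axial object point ($h' = 0$). The sketch below, with illustrative names and values, samples the zonal curve for a fixed $\rho$ and reports how much its distance from the axial image point varies with $\theta$.

```python
import math


def axial_radius(theta, rho, p1, p4, p8):
    """Distance of the ray intersection from the axial image point (h' = 0)."""
    c, s = math.cos(theta), math.sin(theta)
    eps_y = (4 * p1 * c * c + 2 * p4 * s * s) * c * rho ** 3
    eps_z = (2 * p4 * c * c + 4 * p8 * s * s) * s * rho ** 3
    return math.hypot(eps_y, eps_z)


def radius_spread(rho, p1, p4, p8, samples=360):
    """Difference between largest and smallest radius round the zone."""
    radii = [axial_radius(2 * math.pi * k / samples, rho, p1, p4, p8)
             for k in range(samples)]
    return max(radii) - min(radii)


if __name__ == "__main__":
    print(radius_spread(1.0, 1.0, 2.0, 1.0))   # 2 p1 = p4 = 2 p8
    print(radius_spread(1.0, 1.0, 3.0, 1.0))   # condition violated
```

When the condition holds the zonal curve is a circle of radius $4p_1\rho^3$; otherwise the radius depends on $\theta$.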


(ii) Coma 
The partial displacement has the generic form 

$$\bar{y} = a\cos 2\theta + b\sin 2\theta + a + d, \qquad \bar{z} = c\cos 2\theta + d\sin 2\theta + b - c, \tag{87.6}$$

with an obvious interpretation of a, ...; e.g. 

$$a = \tfrac12\rho^2 h'(3p_2 - p_6)\cos\psi.$$
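
The generic form (87.6) can be tested against (87.3) without relying on the long expansion. In the sketch below the terms of order $\rho^2 h'$ are isolated by a scaling argument: replacing $\rho$ by $s\rho$ turns each component into a cubic in $s$, whose $s^2$ coefficient follows from its even part at $s = 1$ and $s = 2$. The constant $a$ is the one given in the text; the values used for $b$, $c$, $d$ are this sketch's reading of the ‘obvious interpretation’ and should be regarded as assumptions. All names are illustrative.

```python
import math


def displacement(rho, theta, h, psi, p):
    """(87.3) with the polar substitution (87.4)."""
    y_p, z_p = rho * math.cos(theta), rho * math.sin(theta)
    y1, z1 = h * math.cos(psi), h * math.sin(psi)
    xi, eta, zeta, tau = y_p ** 2, y_p * y1, y1 ** 2, (z_p - z1) ** 2
    p1, p2, p3, p4, p5, p6, p7, p8 = p
    eps_y = (2 * y_p * (2 * p1 * xi + p2 * eta + p3 * zeta + p4 * tau)
             + y1 * (p2 * xi + p5 * zeta + p6 * tau))
    eps_z = 2 * (z_p - z1) * (p4 * xi + p6 * eta + p7 * zeta + 2 * p8 * tau)
    return eps_y, eps_z


def coma_part(rho, theta, h, psi, p):
    """Terms of order rho^2 h', extracted via the even part in the scale s."""
    def even(s):
        plus = displacement(s * rho, theta, h, psi, p)
        minus = displacement(-s * rho, theta, h, psi, p)
        return [(u + v) / 2 for u, v in zip(plus, minus)]
    e1, e2 = even(1.0), even(2.0)
    return tuple((b - a) / 3 for a, b in zip(e1, e2))


def coma_constants(rho, h, psi, p):
    """a as in the text; b, c, d as read off here (assumed interpretation)."""
    p1, p2, p3, p4, p5, p6, p7, p8 = p
    k = rho ** 2 * h
    a = 0.5 * k * (3 * p2 - p6) * math.cos(psi)
    b = -2 * k * p4 * math.sin(psi)
    c = -k * (p4 - 6 * p8) * math.sin(psi)
    d = k * p6 * math.cos(psi)
    return a, b, c, d


def generic_form(theta, a, b, c, d):
    """Right-hand sides of (87.6)."""
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    return a * c2 + b * s2 + a + d, c * c2 + d * s2 + b - c


if __name__ == "__main__":
    p = (0.3, -0.2, 0.5, 0.1, -0.4, 0.25, 0.15, -0.05)   # arbitrary test values
    rho, h, psi = 0.8, 0.6, 1.1
    for theta in (0.0, 0.7, 2.0, 4.5):
        print(coma_part(rho, theta, h, psi, p),
              generic_form(theta, *coma_constants(rho, h, psi, p)))
```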


For any fixed value of $\psi$ one has a displacement generically of 
exactly the same kind as that due to higher-order elliptical coma in 
semi-symmetric systems (see (77.2)). One cannot therefore speak 
of ‘circular coma’ here, but rather of linear coma, as suggested in 
Section 22. A more detailed discussion is superfluous. We merely 
remark in passing that, for general values of the coefficients, there 
is always a value of $\psi$ such that the partial displacement, taken by 
itself, represents a triangular image; but a statement of this kind is 
somewhat misleading since the partial displacements represent- 
ing other types will, under these circumstances, certainly not be 
absent; cf., by contrast, the results of Section 88. 


(iii) Curvature of field 


We consider the image as usual in an out-of-focus plane shifted with 
respect to ¥%’ through a distance kh’. The partial displacement is 
then generically just that which obtained in the case of the (7-) 
semi-symmetric system, with the following formal identifications 


(7-10), p,) cost y+ (pa Ops) sin®* y+ 0 
—(p3— 3P7) cos” W — (py— 18g) sin? yf > oy, 

—2p,siny cos p > 6. (87.7) 
The discussion of curvature of field given in Section 76 may there- 
fore to a large extent be taken over into the present context. Pre- 
tending for the moment that the partial displacement can be discussed 
as if all the other types were absent, we observe that, naturally, 
the various secondary image surfaces will bear a more complicated 
character than was the case when axial symmetry obtained. The 
‘Petzval surface’ will serve as an example. Thus, point images result 
for all values of $\psi$ if $p_6 = 0$, $p_3 = p_7$, and $p_4 = 6p_8$. The surface on 
which these images are formed therefore has principal curvatures 
$c_1$ and $c_2$, whose values are 

$$c_1 = 4p_3, \qquad c_2 = 4p_4 \tag{87.8}$$

in a neighbourhood of the axis. In general one therefore has an 
ellipsoid or an hyperboloid in a neighbourhood of the axis. Yet, 
how can that be? If points such that $y_1 = 0$ have sharp images these 
must surely fall along a straight line, in view of the translational 
symmetry of K, so that the Petzval surface must be cylindrical. 
The paradox is only apparent, and further emphasizes the artificial 
nature of this analysis: to have in fact a sharp image, spherical 
aberration at least must be absent, and so certainly $p_4 = 0$; and then 
$c_2 = 0$, as required. 
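
The three conditions for point images can be illustrated numerically. In the sketch below the terms of order $\rho h'^2$ are isolated from (87.3) by scaling $\rho \to s\rho$ and taking the part linear in $s$. When $p_6 = 0$, $p_3 = p_7$ and $p_4 = 6p_8$ the resulting displacement is purely radial in the aperture, i.e. proportional to $(\cos\theta, \sin\theta)$, which is the form that a suitable shift of the receiving plane can cancel. The function names and coefficient values are illustrative assumptions.

```python
import math


def displacement(rho, theta, h, psi, p):
    """(87.3) with the polar substitution (87.4)."""
    y_p, z_p = rho * math.cos(theta), rho * math.sin(theta)
    y1, z1 = h * math.cos(psi), h * math.sin(psi)
    xi, eta, zeta, tau = y_p ** 2, y_p * y1, y1 ** 2, (z_p - z1) ** 2
    p1, p2, p3, p4, p5, p6, p7, p8 = p
    eps_y = (2 * y_p * (2 * p1 * xi + p2 * eta + p3 * zeta + p4 * tau)
             + y1 * (p2 * xi + p5 * zeta + p6 * tau))
    eps_z = 2 * (z_p - z1) * (p4 * xi + p6 * eta + p7 * zeta + 2 * p8 * tau)
    return eps_y, eps_z


def field_curvature_part(rho, theta, h, psi, p):
    """Terms of order rho h'^2: the part linear in the aperture scale s."""
    def odd(s):
        plus = displacement(s * rho, theta, h, psi, p)
        minus = displacement(-s * rho, theta, h, psi, p)
        return [(u - v) / 2 for u, v in zip(plus, minus)]
    o1, o2 = odd(1.0), odd(2.0)
    # odd(s) = C s + A s^3, hence C = (8 odd(1) - odd(2)) / 6
    return tuple((8 * a - b) / 6 for a, b in zip(o1, o2))


def tangential_component(rho, theta, h, psi, p):
    """Component of the partial displacement normal to (cos theta, sin theta)."""
    ey, ez = field_curvature_part(rho, theta, h, psi, p)
    return ey * math.sin(theta) - ez * math.cos(theta)


if __name__ == "__main__":
    petzval = (0.2, 0.1, 0.5, 0.6, -0.3, 0.0, 0.5, 0.1)   # p6=0, p3=p7, p4=6 p8
    general = (0.2, 0.1, 0.5, 0.6, -0.3, 0.2, 0.3, 0.1)
    for psi in (0.0, 0.9, 2.4):
        print(tangential_component(0.7, 1.3, 0.5, psi, petzval),
              tangential_component(0.7, 1.3, 0.5, psi, general))
```

With the conditions satisfied the radial magnitude is $2(p_3\cos^2\psi + p_4\sin^2\psi)\rho h'^2$, which depends on $\psi$ unless $p_3 = p_4$.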

(iv) Distortion 

Both components of the displacement are, in general, non-zero. Once 
again, this partial displacement has little more than formal 
significance, unless all other types are absent. Then one is left only 
with the terms governed by $p_5$, and the displacement in the 
z-direction vanishes, independently of the value of $\psi$. 

In conclusion we remark that the occurrence of a particular 
coefficient (such as $p_4$) in the several partial displacements is 
essentially a consequence of having a stop whose shape does not conform 
to the symmetries of K. Were the stop a slit (like K as such, 
unlimited, in principle, in the z-direction), $z_1$ could always be 
arranged to be zero. A zonal curve would then be taken to 
correspond to a fixed value of $y'$; and, recalling (87.3), the four ‘types’ of 
partial displacements would be governed by (i) $p_1, p_4, p_8$, (ii) $p_2, p_6$, 
(iii) $p_3, p_7$, (iv) $p_5$, respectively. 


88. Digression on plane-symmetric systems 


The symmetric and c-symmetric systems, as well as those to be 
considered in some detail in the next chapter, all have one common 
property, called double plane-symmetry, which means the existence 
of two mutually perpendicular planes of symmetry, whose line of 
intersection defines a preferred axis, and so a preferred base-ray.
We therefore digress for a moment to consider the doubly
plane-symmetric system as such. We do so partly to lend added 
weight to the point of view that in certain respects the Hamiltonian 
theory is interesting not so much because it tells us what kind of 
imagery a given type of system can achieve, but rather what kind of 
imagery it cannot achieve. For this reason we prefer to be landed 
with a slight amount of repetitiveness inherent in separately treat- 
ing the more highly symmetric systems as we have done here, instead 
of regarding the systems in question as being doubly plane-sym- 
metric, but with additional symmetries. The confusion likely to be 
engendered by constantly having to investigate ‘special cases’ 
would surely be a disadvantage; and in any event translationally 
semi-symmetric systems, for example, would be excluded from 
the outset. 

In the context of the point characteristic, one is confronted with 
the six ‘reflection invariants’ $\xi_1 = y'^2$, $\eta_1 = y'y_1$, $\zeta_1 = y_1^2$, $\xi_2 = z'^2$,
$\eta_2 = z'z_1$, $\zeta_2 = z_1^2$, where $y_1 = m_1 y$, $z_1 = m_2 z$. Here $m_1$ and $m_2$ are
the magnifications associated with the pair of conjugate normal
planes $\mathcal{I}$ and $\mathcal{I}'$ being contemplated (cf. (10.17)), and it is definitely
supposed, for the sake of simplicity, that K does in fact produce a
sharp image of a plane object in $\mathcal{I}$; cf. (10.16). Even powers of $\eta_1$
and $\eta_2$ are always to be eliminated by means of the identities

$$\eta_1^2 = \xi_1\zeta_1, \qquad \eta_2^2 = \xi_2\zeta_2.$$


The anterior base-point is taken as $O_0$, whilst the posterior base-
point B’ is chosen arbitrarily. When the stop lies in the image space
one will naturally take B’ to be its axial point; when it does not, one 
has the situation that there is, in general, no sharply defined exit 
pupil, so that one does not then have such a ‘natural’ choice of the 
position of B’. 

The ideal image is taken to be sharp, with constant magnifications 
in the y’- and z’-directions. Accordingly, on setting d’ = 1, as 
usual, 

$$V = g(\zeta_1, \zeta_2) - (1 + \xi_1 - 2\eta_1 + \zeta_1 + \xi_2 - 2\eta_2 + \zeta_2)^{1/2} + v, \tag{88.1}$$


where v is the aberration function. There is no need to go through 
the paraxial theory once again. Indeed, if desired, one may simply 
draw upon the general equations of Section 14, adopting the y- 
components as they stand, and their z-components similarly, but 
with %,, k,,k, replaced by constants J, J,, /, respectively. One has, 
of course, two focal lengths $f_1$ and $f_2$, and in (88.1) the function
$g(\zeta_1, \zeta_2)$ has the linear terms $-\zeta_1/(2m_1 f_1) - \zeta_2/(2m_2 f_2)$.

The number of aberration coefficients which govern the displacement
in $\mathcal{I}'$ may be counted easily as follows. We temporarily imagine
v to be written in the generic form

$$v = A + \eta_1 B + \eta_2 C + \eta_1\eta_2 D, \tag{88.2}$$


where A, B, C, D are functions of $\xi_1$, $\xi_2$, $\zeta_1$, $\zeta_2$. That there are no
coefficients of even order goes without saying. With regard to those
of order $2n-1$, one has to add the number of coefficients in one
polynomial of degree $n$, two of degree $n-1$, and one of degree $n-2$,
each in four variables. Further, the sum of these has to be reduced
by the number of coefficients multiplying products of powers of $\zeta_1$
and $\zeta_2$ alone, i.e. by $n+1$. Accordingly, recalling a result stated
shortly after equation (11.2), the total number of coefficients is

$$\binom{n+3}{3} + 2\binom{n+2}{3} + \binom{n+1}{3} - (n+1) = \tfrac{2}{3}n(n+1)(n+2). \tag{88.3}$$

This will be seen to be in harmony with (11.5), if, in the latter, one
replaces $n$ by $2n-1$ and then subtracts $n+1$, to allow for the
irrelevant coefficients.
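
The counting argument leading to (88.3) is easy to check by machine. The following is only a small sketch: the function names are illustrative, and the only input is the standard fact that a homogeneous polynomial of degree d in four variables has C(d+3, 3) coefficients.

```python
from math import comb

def poly_terms(d: int) -> int:
    """Number of coefficients of a homogeneous polynomial of degree d
    in four variables (zero when d is negative)."""
    return comb(d + 3, 3) if d >= 0 else 0

def count_88_3(n: int) -> int:
    """Coefficients of order 2n-1, counted as in the text: one polynomial
    of degree n (A), two of degree n-1 (B and C), one of degree n-2 (D),
    less the n+1 irrelevant products of powers of zeta_1 and zeta_2."""
    return poly_terms(n) + 2 * poly_terms(n - 1) + poly_terms(n - 2) - (n + 1)

def closed_form_88_3(n: int) -> int:
    """Right-hand member of (88.3), (2/3) n (n+1) (n+2)."""
    return 2 * n * (n + 1) * (n + 2) // 3

for n in range(1, 10):
    assert count_88_3(n) == closed_form_88_3(n)

print(count_88_3(2))  # number of third-order (primary) coefficients
```

For n = 2 the count reproduces the sixteen primary coefficients that appear in (88.6).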

We may mention in passing a possible additional symmetry of 
the doubly plane-symmetric system which has not been hitherto 
referred to, since it is of a rather academic kind. The symmetry in 
question is invariance under rotations through 90° about the axis. When
this obtains one has to have

$$V(\xi_1, \eta_1, \zeta_1, \xi_2, \eta_2, \zeta_2) = V(\xi_2, \eta_2, \zeta_2, \xi_1, \eta_1, \zeta_1), \tag{88.4}$$

where, just for the moment, the use of variables y, z in place of
$y_1$, $z_1$ is understood. It follows immediately that $m_1 = m_2$, and
$f_1 = f_2$; so that paraxially K behaves as if it were an r-symmetric
system. Outside the paraxial region the state of affairs is, of course, 
more general. Thus, in the third-order region one is still left with 
nine aberration coefficients, as compared with five in the symmetric 
case. 

Returning to the general doubly plane-symmetric system, the 
third-order displacement is given as usual by 


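$$\epsilon'_3 = \partial v^{(3)}/\partial y', \tag{88.5}$$
and we write uniformly

$$\begin{aligned} v^{(3)} ={}& p_1\xi_1^2 + p_2\xi_1\xi_2 + p_3\xi_2^2 + p_4\xi_1\eta_1 + p_5\xi_1\eta_2 + p_6\xi_2\eta_1 \\ &+ p_7\xi_2\eta_2 + p_8\xi_1\zeta_1 + p_9\xi_1\zeta_2 + p_{10}\eta_1\eta_2 + p_{11}\xi_2\zeta_1 + p_{12}\xi_2\zeta_2 \\ &+ p_{13}\eta_1\zeta_1 + p_{14}\eta_1\zeta_2 + p_{15}\zeta_1\eta_2 + p_{16}\eta_2\zeta_2. \end{aligned} \tag{88.6}$$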
Then, making the substitutions (87.4), one gets the following 
partial displacements: 
(i) Spherical aberration

$$\epsilon'_y = \rho^3(4p_1\cos^2\theta + 2p_2\sin^2\theta)\cos\theta,$$
$$\epsilon'_z = \rho^3(2p_2\cos^2\theta + 4p_3\sin^2\theta)\sin\theta; \tag{88.7}$$
(ii) Coma

$$\epsilon'_y = \rho^2 h'[\tfrac12(3p_4 + p_6)\cos\psi + \tfrac12(3p_4 - p_6)\cos\psi\cos 2\theta + p_5\sin\psi\sin 2\theta],$$
$$\epsilon'_z = \rho^2 h'[\tfrac12(p_5 + 3p_7)\sin\psi + \tfrac12(p_5 - 3p_7)\sin\psi\cos 2\theta + p_6\cos\psi\sin 2\theta]; \tag{88.8}$$


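(iii) Curvature of field

$$\epsilon'_y = \rho h'^2\{[2(p_8\cos^2\psi + p_9\sin^2\psi) - k]\cos\theta + p_{10}\sin\psi\cos\psi\sin\theta\},$$
$$\epsilon'_z = \rho h'^2\{[2(p_{11}\cos^2\psi + p_{12}\sin^2\psi) - k]\sin\theta + p_{10}\sin\psi\cos\psi\cos\theta\}, \tag{88.9}$$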


where k is the usual constant relating to the position of the out-of- 
focus image plane; and 


(iv) Distortion

$$\epsilon'_y = h'^3(p_{13}\cos^2\psi + p_{14}\sin^2\psi)\cos\psi,$$
$$\epsilon'_z = h'^3(p_{15}\cos^2\psi + p_{16}\sin^2\psi)\sin\psi. \tag{88.10}$$

We now have the important conclusion that the displacement is
just like that represented by (87.5), except that the interaction
between the various types is entirely removed; this being made possible
by the greater number of independent aberration coefficients
which the merely doubly plane-symmetric system possesses. It is
therefore possible now to have a Petzval surface of a far less formal
character than that encountered in Section 87. In short, equations
(88.8-10) enable us to see quite clearly how the introduction of
additional symmetries generically affects the partial displacements,
and the interactions between them. The displacement (19.5) is
of course a very special case in which one again has no interaction
between the different types; this feature being the result of
having a stop which conforms to the symmetries of the system, as
remarked at the end of Section 87.

We could now go on to discuss such problems as finding the general
conditions imposed upon K by the condition of reversibility.
Indeed we would then find that there are five relations between the
sixteen primary coefficients, and, in general, they cannot all be
homogeneous simultaneously. However, as remarked earlier, we
prefer to deal with such questions in the more specialized situations
as we come to them.
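
The four partial displacements of this section can be checked numerically against the aberration function (88.6). The sketch below rests on assumptions that are not spelled out in this passage: that the two components of the displacement are the derivatives of $v^{(3)}$ with respect to y’ and z’, that the substitutions (87.4) amount to $y' = \rho\cos\theta$, $z' = \rho\sin\theta$, $y_1 = h'\cos\psi$, $z_1 = h'\sin\psi$, and that the focal shift k is set to zero. All function names are illustrative.

```python
import math
import random

def v3(yp, zp, y1, z1, p):
    """Third-order aberration function (88.6); p[1]..p[16] are the coefficients."""
    x1, e1, c1 = yp * yp, yp * y1, y1 * y1   # xi_1, eta_1, zeta_1
    x2, e2, c2 = zp * zp, zp * z1, z1 * z1   # xi_2, eta_2, zeta_2
    return (p[1]*x1*x1 + p[2]*x1*x2 + p[3]*x2*x2
            + p[4]*x1*e1 + p[5]*x1*e2 + p[6]*x2*e1 + p[7]*x2*e2
            + p[8]*x1*c1 + p[9]*x1*c2 + p[10]*e1*e2
            + p[11]*x2*c1 + p[12]*x2*c2
            + p[13]*e1*c1 + p[14]*e1*c2 + p[15]*c1*e2 + p[16]*e2*c2)

def closed_form(rho, th, h, ps, p):
    """Sum of the partial displacements (88.7)-(88.10), taken with k = 0."""
    c, s = math.cos(th), math.sin(th)
    cp, sp = math.cos(ps), math.sin(ps)
    c2t, s2t = math.cos(2 * th), math.sin(2 * th)
    ey = (rho**3 * (4*p[1]*c*c + 2*p[2]*s*s) * c
          + rho**2 * h * (0.5*(3*p[4] + p[6])*cp
                          + 0.5*(3*p[4] - p[6])*cp*c2t + p[5]*sp*s2t)
          + rho * h**2 * (2*(p[8]*cp*cp + p[9]*sp*sp)*c + p[10]*sp*cp*s)
          + h**3 * (p[13]*cp*cp + p[14]*sp*sp) * cp)
    ez = (rho**3 * (2*p[2]*c*c + 4*p[3]*s*s) * s
          + rho**2 * h * (0.5*(p[5] + 3*p[7])*sp
                          + 0.5*(p[5] - 3*p[7])*sp*c2t + p[6]*cp*s2t)
          + rho * h**2 * (2*(p[11]*cp*cp + p[12]*sp*sp)*s + p[10]*sp*cp*c)
          + h**3 * (p[15]*cp*cp + p[16]*sp*sp) * sp)
    return ey, ez

def max_error(trials=50, d=1e-5, seed=0):
    """Largest difference between the closed forms and central-difference
    derivatives of v3 over a set of random rays and coefficients."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        p = [0.0] + [rng.uniform(-1, 1) for _ in range(16)]
        rho, h = rng.uniform(0.1, 1), rng.uniform(0.1, 1)
        th, ps = rng.uniform(0, 2 * math.pi), rng.uniform(0, 2 * math.pi)
        yp, zp = rho * math.cos(th), rho * math.sin(th)
        y1, z1 = h * math.cos(ps), h * math.sin(ps)
        ey_num = (v3(yp + d, zp, y1, z1, p) - v3(yp - d, zp, y1, z1, p)) / (2 * d)
        ez_num = (v3(yp, zp + d, y1, z1, p) - v3(yp, zp - d, y1, z1, p)) / (2 * d)
        ey, ez = closed_form(rho, th, h, ps, p)
        worst = max(worst, abs(ey - ey_num), abs(ez - ez_num))
    return worst

print(max_error())
```

The check makes the absence of interaction visible: each group of coefficients ($p_1$ to $p_3$, $p_4$ to $p_7$, $p_8$ to $p_{12}$, $p_{13}$ to $p_{16}$) enters one type of displacement only.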

When a system has only a single plane of symmetry, the ‘meri- 
dional plane’ z = z’ = 0, say, the state of affairs is a good deal 
more complicated, mainly because there will be aberrations of even 
as well as of odd orders. One is thus confronted with a great multi- 
plicity of aberration coefficients. The lowest (non-parabasal) order 
which necessarily has to be considered is the second, and this in- 
volves already eight distinct coefficients, in general. Consequently 
the distinction between the effective and the characteristic third- 
order coefficients is no longer of the trivial kind encountered earlier. 
The base-ray will naturally be taken to be in the plane of symmetry,
a choice which was already implicit in our writing the
equation of the meridional plane $z = z' = 0$. Beyond this, no
natural choice of base-ray suggests itself, since one has here no preferred
axis. Then

$$V = V(y', y, z'^2, z'z, z^2), \tag{88.11}$$

there being no linear terms in the power series for V.

We could now carry through the usual programme to investigate 
the general properties of singly plane-symmetric systems. This 
would be little more than an exercise following rather closely the 
lines of the work of the first part of this section. A few remarks of a 
more or less general kind will therefore suffice. To this end we 
imagine K to have been so designed that it forms a sharp parabasal 
image, the conditions being those considered in Section 10, shortly 
after equation (10.13). Then the ideal point characteristic is 


$$V_0 = g(y, z^2) - [d'^2 + (y' - m_1 y)^2 + (z' - m_2 z)^2]^{1/2}, \tag{88.12}$$
and the aberration function

$$v = \sum_{n \geq 2} v_n, \tag{88.13}$$


contains no terms quadratic in the ray-coordinates. One finds that
the number of aberration coefficients of order $n$ (> 1) is

$$\tfrac{1}{12}(n+1)(n+2)(n+6) \quad (n \text{ even}), \qquad \tfrac{1}{12}(n+1)(n+3)(n+5) \quad (n \text{ odd}), \tag{88.14}$$

the irrelevant coefficients, i.e. those multiplying products of powers
of y and z only, being not counted, as usual.
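
The result (88.14) can be confirmed by direct enumeration. The sketch below assumes, in line with (88.11), that a term of order n is a monomial of total degree n + 1 in y’, y, z’, z which is even in (z’, z) jointly, and that the irrelevant terms are those containing neither y’ nor z’. The function names are illustrative.

```python
def count_single_plane(n: int) -> int:
    """Count monomials y'^a y^b z'^c z^d of total degree n+1 that are even
    in (z', z) jointly, leaving out those that depend on y and z only."""
    deg = n + 1
    total = 0
    for a in range(deg + 1):
        for b in range(deg + 1 - a):
            for c in range(deg + 1 - a - b):
                d = deg - a - b - c
                if (c + d) % 2:          # odd in z', z: excluded by the symmetry
                    continue
                if a == 0 and c == 0:    # function of y and z only: irrelevant
                    continue
                total += 1
    return total

def formula_88_14(n: int) -> int:
    """The two cases of (88.14)."""
    if n % 2 == 0:
        return (n + 1) * (n + 2) * (n + 6) // 12
    return (n + 1) * (n + 3) * (n + 5) // 12

for n in range(2, 12):
    assert count_single_plane(n) == formula_88_14(n)

print(count_single_plane(2), count_single_plane(3))
```

For n = 2 this gives the eight second-order coefficients mentioned above, and for n = 3 it gives sixteen, the same number as in the doubly plane-symmetric case.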
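For the sake of orientation, consider the second-order displacement

$$\epsilon'_2 = \partial v_2/\partial y'. \tag{88.15}$$
We write

$$v_2 = \pi_1 y'^3 + \pi_2 y'^2 y + \pi_3 y' y^2 + \pi_4 y' z'^2 + \pi_5 y' z' z + \pi_6 y' z^2 + \pi_7 y z'^2 + \pi_8 y z' z. \tag{88.16}$$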


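Proceeding as after equation (88.6), and absorbing powers of $m_1$
and $m_2$ in the $\pi_i$, we get

$$\begin{aligned} \epsilon'_{y2} ={}& \tfrac12\rho^2[(3\pi_1 + \pi_4) + (3\pi_1 - \pi_4)\cos 2\theta] + \rho h'(2\pi_2\cos\theta\cos\psi + \pi_5\sin\theta\sin\psi) \\ &+ h'^2(\pi_3\cos^2\psi + \pi_6\sin^2\psi), \\ \epsilon'_{z2} ={}& \pi_4\rho^2\sin 2\theta + \rho h'(\pi_5\cos\theta\sin\psi + 2\pi_7\sin\theta\cos\psi) + \pi_8 h'^2\sin\psi\cos\psi. \end{aligned} \tag{88.17}$$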


We have encountered no displacement of this kind before. 
Evidently we have three types of aberrations, jointly governed by 
eight coefficients. The type independent of h’ quite closely resembles
elliptical coma, whilst the aberration governed by $\pi_2$, $\pi_5$
and $\pi_7$ is a form of curvature of field, with the usual implications.
The terms independent of $\rho$ clearly represent distortion. The
upshot of all this is that the classification of aberration types of
Section 22 can be generalized by allowing z to be half-odd integral, 
and interchanging the terms ‘coma’ and ‘astigmatism’ when it 
is so. In particular, we have here second-order constant (i.e. zero 
degree) coma, linear astigmatism, and quadratic coma respectively, 
in the order in which these were just discussed. 


89. The displacement when $k_1 \neq k_2$. Line images


Reverting to the c-symmetric system, we know that when $k_1 \neq k_2$,
a sharp image can never be obtained. It is instructive, however, 
to consider the image in the plane $\mathcal{I}'_1$, whose axial point is the meridional
focus, i.e. the point in which meridional rays from O intersect
the axis. This means that we take $d' = -1/k_1$ (= 1). In view of
(86.4) the aberration function appropriate to the third-order
displacement is

$$v = \tfrac12(k_2 + 1)\tau + v^{(3)}, \tag{89.1}$$
where $v^{(3)}$ is given by (87.2). It is, however, advantageous to contemplate,
in place of v, a modified aberration function $\hat v$ (recall
the end of Section 36), defined to all orders by the equation

$$V = g(\zeta) - (1 + \xi - 2\eta + \zeta - k_2\tau - 2\hat v)^{1/2}. \tag{89.2}$$


Evidently $\hat v = O(4)$, unlike v, since all paraxial terms are correctly
accounted for. One confirms easily that

$$\hat v^{(3)} = v^{(3)} + \tfrac14(1 + k_2)\tau[\xi - 2\eta + \zeta + \tfrac12(1 - k_2)\tau]. \tag{89.3}$$
Then

$$\epsilon'_y = \partial\hat v/\partial y' - \tfrac12 k_2(k_2 + 1)(y' - y_1)\tau + O(5),$$
$$\epsilon'_z = \partial\hat v/\partial z' + (k_2 + 1)(z' - z_1)(1 + \tfrac12 k_2^2\tau) + O(5). \tag{89.4}$$

$\hat v$ may be thought of as written like (87.2), with the $p_\alpha$ replaced
by $\hat p_\alpha$. Then the generic third-order displacement is essentially that
encountered in Section 87, but with an additional first-order displacement
in the z’-direction, of amount proportional to $(z' - z_1)$.
We conclude that if K is to produce a (straight) line image of a
point source in $\mathcal{I}$, then this line must lie in $\mathcal{I}'_1$, and, not surprisingly,
it must be normal to the meridional plane.

If every point of $\mathcal{I}$ is to have a line image, $\epsilon'_y$ must be a constant,
depending on $y_1$, for all values of $y'$ and $z'$. To the present order
this requires that

$$\hat p_1 = \hat p_2 = \hat p_3 = 0, \qquad 2\hat p_4 = -\hat p_6 = \tfrac12 k_2(k_2 + 1), \tag{89.5}$$
whilst $\hat p_5$, $\hat p_7$ and $\hat p_8$ are not restricted. It is instructive to consider
the conditions under which K will form a line image of any point
O of $\mathcal{I}$, without any restrictions upon the orders of aberrations to
be included. Then we require

$$\epsilon'_y = y_1[D(\zeta) - 1], \tag{89.6}$$
where D is some function such that D(0) = 1. (89.6) implies the
equations

$$2V_\xi + \alpha' = 0, \qquad V_\eta - \alpha' D = 0, \tag{89.7}$$
whence, in the first place,

$$V = V(\xi - 2D\eta, \zeta, \tau). \tag{89.8}$$


Then, with $\xi - 2D\eta = a$,

$$\beta' = 2(y' - Dy_1)V_a, \qquad \gamma' = 2(z' - z_1)V_\tau \tag{89.9}$$
and using these in the first member of (89.7), there comes

$$4(1 + \xi - 2D\eta + D^2\zeta)V_a^2 + 4\tau V_\tau^2 = 1. \tag{89.10}$$

This may be further simplified by means of a change of variables
according to

$$s = (1 + \xi - 2D\eta + D^2\zeta)^{1/2}, \qquad t = \tau^{1/2},$$

so that finally

$$V_s^2 + V_t^2 = 1. \tag{89.11}$$


All solutions of this equation which can be written as power series 
in s and ¢ with only even powers of ¢ present, represent the point 
characteristic of c-symmetric systems with the required property. 
The set of solutions in question cannot be exhibited explicitly. 
Probably the most suitable general way of dealing with the problem
is to substitute

$$V = \sum_{n=0}^{\infty} v_n(s)\,t^{2n} \tag{89.12}$$

into (89.11), the functions $v_n(s)$ then being found iteratively as the
solutions of ordinary differential equations. However, for our
purposes it will suffice to consider merely a specific example.

To this end we observe first that $V = \text{const.} - (s^2 + t^2)^{1/2}$ satisfies
(89.11), but this solution is useless since it cannot accommodate
the condition $k_1 \neq k_2$. However, the formally trivial addition of a
‘constant’ b to s leaves (89.11) unaffected. We are thus led to the
solution

$$V = \bar g(\zeta) - \{[b + (1 + u)^{1/2}]^2 + \tau\}^{1/2}, \tag{89.13}$$
where

$$u = \xi - 2D\eta + D^2\zeta, \tag{89.14}$$
and the bar over the g serves to distinguish this function from that
in (89.2). b may be a function of $\zeta$, i.e.


B(E) = —[1+(1/Ry)]+416+.-.5 (89.15) 


where the constant term is so determined that (89.12) accommodates 
the paraxial form (86.3) of V. The displacement corresponding to 
this particular choice of the solution of (89.11) is then 


ef =0, € =)(2’—2,)[b+(1 +48]. (89.16) 
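
That the shift of s by a ‘constant’ b leaves (89.11) satisfied can be seen at a glance, but a numerical check costs little. The sketch below treats b and the additive constant as fixed numbers (their values are arbitrary illustrations) and tests $V_s^2 + V_t^2 = 1$ by central differences.

```python
import math

def V(s, t, b=-0.3, const=2.0):
    """Trial solution const - [(s + b)^2 + t^2]^(1/2), with b held fixed."""
    return const - math.hypot(s + b, t)

def eikonal_residual(f, s, t, h=1e-6):
    """|V_s^2 + V_t^2 - 1| evaluated with central differences."""
    Vs = (f(s + h, t) - f(s - h, t)) / (2 * h)
    Vt = (f(s, t + h) - f(s, t - h)) / (2 * h)
    return abs(Vs * Vs + Vt * Vt - 1.0)

points = [(s, t) for s in (1.0, 1.2, 1.5) for t in (0.05, 0.1, 0.4)]
worst = max(eikonal_residual(V, s, t) for s, t in points)
print(worst)
```

Since b enters only through the combination s + b, the same check goes through for any fixed value of b for which s + b does not vanish.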


y 
Further, comparing (89.2) with (89.13), a somewhat tedious cal- 
culation shows that 


BO = dyn +4h,b Cr+ hh, (1 +k) (E—2y +O— bear) Tt. (89.17) 
Thus
Pi = bp = pg =0, 2f,= —py = $h,(1+h,), \ 


: : J (89.18) 
Bs=4y, By = $hy(2b,+hy +1), Bp= P(t +h,);] 


and these are indeed in agreement with (89.5). 


90. Remark on the sine- or cosine-relations


Suppose that $\mathcal{I}$ and $\mathcal{I}'$ are perfect in the restricted sense described
at the beginning of Section 86. Then v = 0 in (86.1), and it follows
at once that for all rays through $O_0$,

$$\beta' - \beta/m = 0, \qquad \gamma' - \nu\gamma = 0, \tag{90.1}$$
the second of which holds in any event, as a consequence of translational
symmetry. We have, in fact, the usual cosine-relations,
which reduce to the sine-relation (27.3) when m happens to have
the value $\nu^{-1}$. The validity of (90.1) only requires that the conjugate
planes be perfect in a sufficiently small neighbourhood of $O_0$.

One can now attempt to undertake an investigation designed to 
yield ‘offences against the cosine-conditions’, after the fashion of 
Sections 31 and 82. It turns out, however, that no useful results 
emerge. This state of affairs is essentially due to the absence of 
axial symmetry. Whereas when the latter obtains, there is only one 
(2n — 1)th-order coefficient of spherical aberration and one of linear 
coma, one has, in the case of c-symmetry, n+1 coefficients of 
spherical aberration which, together with an additional n coeffi- 
cients, govern linear coma. Indeed, the two types of aberrations are 
induced by an aberration function of the generic form 


$$v = S(\xi, \tau) + \eta C(\xi, \tau), \tag{90.2}$$


where the fact that $\tau$ involves $z_1$ should be borne in mind. Therefore,
even in the total absence of spherical aberration, one is still left with 


a function of two variables. When S = 0 one finds that 
90.3 
é, = X'wylC + 2£C,+.2(1 +7) C,] 


for sufficiently small $y_1$, where $w = (1 + \xi + \tau)^{1/2}$; whilst when $y_1 = 0$

$$\beta/m - \beta' = y'C. \tag{90.4}$$
No axial ray trace can tell us anything about the values of the
derivatives of the function C. It is therefore useless to consider
anything but the displacement of ‘sagittal rays’, i.e. rays having
$\theta = \pm\tfrac12\pi$, and for these

$$\epsilon'_y = y_1(1 + \rho^2)^{1/2}C(0, \rho^2), \qquad \epsilon'_z = 0. \tag{90.5}$$
Even the usefulness of this result is illusory, for one cannot obtain
the value of $C(0, \rho^2)$ from (90.4), since $\xi = 0$ requires that the factor
y’ on the right be zero. Without numerical interpolation one thus
cannot get anywhere in this fashion, and in practice this would be a
senseless procedure.


91. Reversibility


Let the c-symmetric system K be reversible, i.e. possess a normal 
plane of symmetry @ whose axial point is C, as usual. For the sake 
of a little variety we investigate the situation sometimes encountered 
in practice, namely when K is entirely telescopic ($f_1 = f_2 = \infty$) and
the object is at infinity. Now $\nu = 1$, and, recalling (86.6), K is
perfect if

$$\beta' = \mu\beta \tag{91.1}$$

for all rays, the condition $\gamma' = \gamma$ being satisfied in any case, on
account of the translational symmetry of K.

Let the point characteristic be taken with respect to a convenient
pair of finitely situated base-points which are disposed symmetrically
about C. (91.1) requires that V depend on y’ and y only through
the combination $(\mu y' - y)^2$. We therefore write, in the usual way,

$$V = g(\mu^2\xi - 2\mu\eta + \zeta, \tau) + v. \tag{91.2}$$

Reversibility entails that V be invariant under the mutual interchange
of $\xi$ and $\zeta$. On momentarily confining our attention to the
paraxial terms of V we see that we must have

$$\mu^2 = 1, \tag{91.3}$$
cf. (56.13). From the point of view of the ‘displacement’, now defined
as

$$e' = \beta' - \mu\beta, \tag{91.4}$$


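the previous choice of reflection invariants is not convenient, and
we replace them according to

$$\bar\xi = \xi, \qquad \bar\eta = \mu\xi - \eta, \qquad \bar\zeta = \xi - 2\mu\eta + \zeta, \qquad \bar\tau = \tau, \tag{91.5}$$
v now being regarded as a function of these variables. It is obvious
that e’ involves the derivatives of v with respect to $\bar\xi$ and $\bar\eta$ only,
whence we again arrive at the conclusion stated at the end of Section 86.

Now, upon mutually interchanging $\xi$ and $\zeta$,

$$\bar\xi \to \bar\xi - 2\mu\bar\eta + \bar\zeta, \qquad \bar\eta \to -\bar\eta + \mu\bar\zeta, \qquad \bar\zeta \to \bar\zeta, \qquad \bar\tau \to \bar\tau. \tag{91.6}$$
The reversibility of K then requires that, identically,

$$v(\bar\xi, \bar\eta, \bar\zeta, \bar\tau) = v(\bar\xi - 2\mu\bar\eta + \bar\zeta, -\bar\eta + \mu\bar\zeta, \bar\zeta, \bar\tau). \tag{91.7}$$

When $\xi$ and $\zeta$ are mutually interchanged, $\bar\xi - 2\mu\bar\eta + \bar\zeta$ and $\bar\xi$ interchange
place, whilst $\bar\xi - \mu\bar\eta$, $\bar\zeta$ and $\bar\tau$ are unaffected. The primary
aberration function must therefore have the form

$$v^{(3)} = p_1(\bar\xi - 2\mu\bar\eta + \bar\zeta)\bar\xi + p_2(\bar\xi - \mu\bar\eta)\bar\zeta + p_3(\bar\xi - \mu\bar\eta)\bar\tau, \tag{91.8}$$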


the irrelevant terms being omitted, as usual. We thus recognize 
that, when the object is at infinity, the telescopic c-symmetric 
reversible system has only three independent primary aberration 
coefficients. Of these, two can be regarded as governing the imagery 
in the meridional section alone. 
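
The invariance required by (91.7) can be confirmed for the form (91.8) by a short numerical experiment. The sketch assumes $\xi = y'^2$, $\eta = y'y$, $\zeta = y^2$, so that the interchange of $\xi$ and $\zeta$ is simply the interchange of y’ and y; the coefficient names p1, p2, p3 are labels chosen here for the three independent coefficients.

```python
import random

def v3_reversible(yp, y, tau, mu, p1, p2, p3):
    """Primary aberration function (91.8) in the barred variables (91.5)."""
    xi, eta, zeta = yp * yp, yp * y, y * y
    xb = xi                           # xi-bar
    eb = mu * xi - eta                # eta-bar
    zb = xi - 2 * mu * eta + zeta     # zeta-bar
    tb = tau                          # tau-bar
    return (p1 * (xb - 2 * mu * eb + zb) * xb
            + p2 * (xb - mu * eb) * zb
            + p3 * (xb - mu * eb) * tb)

def max_asymmetry(trials=100, seed=1):
    """Largest change of v3 under the interchange y' <-> y, for mu = +1 and -1."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        yp, y, tau = rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(0, 1)
        p1, p2, p3 = (rng.uniform(-1, 1) for _ in range(3))
        for mu in (1.0, -1.0):        # (91.3) allows only mu^2 = 1
            a = v3_reversible(yp, y, tau, mu, p1, p2, p3)
            b = v3_reversible(y, yp, tau, mu, p1, p2, p3)
            worst = max(worst, abs(a - b))
    return worst

print(max_asymmetry())
```

With $\mu^2 = 1$ the three brackets in (91.8) reduce to $\xi\zeta$, $\mu\eta\bar\zeta$ and $\mu\eta\tau$, each of which is manifestly symmetric in $\xi$ and $\zeta$.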


92. Concentricity 


It remains to consider the concentric c-symmetric system, as defined 
in Section 84. A single refracting circular cylinder is of this kind, so 
that the problem is not merely academic. On the other hand, the 
theory has its own peculiar difficulties, for the following reasons. 
One would naturally wish to use the angle characteristic T, as we 
did in Section 64, since invariance under rotations—here about a 
line—is then so easily dealt with. However, K being always tele- 
scopic in the sagittal section, the use of T is forbidden. The point 
characteristic is not easily handled either, since the derivation of its 
generic form involves the use of the differential equations (3.6). 
Essentially the same remark also applies to the mixed characteristics 
$W_1$ and $W_2$. In short, we are led to contemplate the use of one of the
‘strange’ mixed characteristics first mentioned at the end of Section 5.
A suitable choice is

$$W = T + z'\gamma', \tag{92.1}$$
regarded as a function of $\beta'$, $z'$, $\beta$, $\gamma$. It is assumed that $f_1 \neq \infty$.

From simple geometrical considerations it is apparent that all
characteristic functions, with the sole exception of V, will not them- 
selves be invariant under translations. The translational symmetry 
of K is therefore taken into account by requiring W to be such that 
the condition (84.4) be satisfied; and this will be the case provided 


$$W = \nu z'\gamma + A(\beta', \beta, \gamma), \tag{92.2}$$
where A is some function of its arguments.

We have not yet adopted any particular base-points. From the 
point of view of the object and image planes there is no particular 
choice which is outstandingly advantageous. The same is not true, 
however, in the context of rotations, that is to say, the rotations 
about the line through the centre C, normal to the meridional plane. 
In fact, recalling the work following equation (65.11) it is natural
to take both B and B’ to coincide with C; for then W can depend
on $\beta'$ and $\beta$ solely through the function $\alpha\alpha' + \beta\beta'$. Thus now

$$W = \nu z'\gamma + S(\alpha\alpha' + \beta\beta', \gamma), \tag{92.3}$$
which, incidentally, is seen to be in harmony with (92.1). The fact
that $\gamma'$ occurs in $\alpha'$ need cause us no worry since we can simply
replace it by $\nu\gamma$. If

$$\xi = \beta'^2, \qquad \eta = \beta'\beta, \qquad \zeta = \beta^2, \qquad \tau = \gamma^2, \tag{92.4}$$

we therefore have the required generic result

$$W(\beta', z', \beta, \gamma) = \nu z'\gamma + G(\chi, \tau), \tag{92.5}$$
where

$$\chi = 1 - (1 - \xi - \nu^2\tau)^{1/2}(1 - \zeta - \tau)^{1/2} - \eta. \tag{92.6}$$


W still involves a function of two variables. We conclude that the
concentric system has $n+1$ independent aberration coefficients of
order $2n-1$. Here it was assumed that the object is not at infinity.
When it is, the part $G(0, \tau)$ of (92.5) is irrelevant, and the number of
coefficients is therefore reduced to $n$. At the same time it becomes
easier to write down the displacement e’ (= $\epsilon'_y$), which becomes


ge (0- op | G, +20 he. (92.7) 


a’ 
Since e’ must vanish in the paraxial region when OQ, is taken at the 


meridional focus (x’ = x’), 


x’ = f, = G,(0,0). 
Setting $f_1 = 1$, by choice of the unit of length, (92.7) becomes


$$e' = \left(\frac{\beta'}{\alpha'} - \frac{\beta}{\alpha}\right)(1 - \alpha G_\chi). \tag{92.8}$$


In particular, when spherical aberration of all orders is absent, i.e.
e’ = 0 when $\beta = \gamma = 0$, we have to have

$$G(\chi, \tau) = \chi, \tag{92.9}$$

rejecting an irrelevant additive function of $\tau$; a result which should
be compared with (68.8). Under these conditions we then have

$$e' = \left(\frac{\beta'}{\alpha'} - \frac{\beta}{\alpha}\right)(1 - \alpha), \tag{92.10}$$

which is identical with the y-component of (69.2).


Problems 


P.8 (i). Investigate the consequences of the reversibility of doubly 
plane-symmetric systems, obtaining explicit results at least in the 
third-order region. (The system forms a sharp paraxial image.)


P.8 (ii). Obtain the result (88.14). 


P.8 (iii). How would you define a ‘cylindrical point characteristic’
analogous to the spherical point characteristic? Define the aberration
function, granted that the ideal behaviour of K is that of Section 86.
Write down generic expressions for $\epsilon'_y$ and $\epsilon'_z$ analogous to (37.4).


P.8 (iv). A doubly plane-symmetric system forms a sharp plane 
image, the disposition of the axis of K and of the various normal 
planes being as usual. Find the generic form of the point charac- 
teristic. Correctly to the third order, write down the distortion 
functions which appear, as power series whose coefficients are to be 
identified with those which occur in (88.6). 


P.8(v). Is the name ‘circular coma’ appropriate to the (third-order) 
linear coma of a doubly plane-symmetric system? Justify your 
answer by a detailed discussion. 




CHAPTER 9 


SYSTEMS WITH TOROIDAL SYMMETRY 


93. Definition of the toroidally symmetric system


A system K is toroidally semi-symmetric if it has an axis of symmetry 
@ which does not have points in common with both the object space 
and the image space. This last qualification is important, for without 
specification of the situation of the axis of symmetry relative to 
object and image one does not know whether one is talking about a 
toroidally semi-symmetric system, or the kind of system simply 
called semi-symmetric in Chapter 7. K is toroidally symmetric if 
there exists a plane of symmetry containing @. The imagery of such 
a system is still pretty complex. A considerably simpler situation 
obtains if K also has a plane of symmetry normal to @; and then 
we call it t-symmetric. One now has two planes of symmetry whose 
line of intersection ¥ is the axis of K. The object plane, image plane, 
and pupil planes (when the latter can be properly defined) are all 
planes normal to .~, as usual; and, in the context of the definitions 
above, they were naturally not regarded as ‘parts of K’. Our 
discussion will be brief, and will revolve for the most part about 
t-symmetric systems. 

A simple example of a t-symmetric system is provided by (a
part of) a circular torus made of some refracting medium. This is a 
very specialized case of course, not so much because only a single 
dielectric medium is involved, but rather because the meridional 
cross-section of the system is circular. (In general the cross-section
could be bounded by any curves symmetric about the axis of K,
and then we speak of a toroid.) In terms of a set of Cartesian axes,
pairwise parallel to those of the usual coordinate basis, and with
origin at the axial point C of @, the equation of the torus is


$$[x^2 + y^2 + z^2 - (R_1^2 + R_2^2)]^2 - 4R_1^2(R_2^2 - y^2) = 0, \tag{93.1}$$


where $|R_1 \pm R_2|$ and $R_2$ are the principal radii of the surface at its
point of intersection with the sagittal and meridional planes respectively.
Actually (93.1) represents a surface $\mathcal{T}$ a good deal more complicated
than any we have contemplated hitherto in as far as, taken
as a whole, it is not simply connected. This means that we can draw
closed curves on $\mathcal{T}$ such that two points on either side of the curve
can be connected by a second curve which does not intersect the first.

The kind of complication just remarked upon is largely irrelevant 
here since we do not consider toroids as a whole but only the parts 
lying to one side of the normal plane through C, and of any such 
part we again only take a part, namely either that which is convex or 
that which is concave towards @ near the sagittal plane. More 
simply, and more generally, any refracting surface (or any surface of 
constant refractive index when this varies continuously) is to be 
written in a form in which x appears as a convergent series in ascending
powers of $y^2$ and $z^2$. With regard to the example of the torus,
this gives rise to four alternatives, namely


$$\pm x = (R_1 \pm R_2) - \tfrac12\left(\pm\frac{y^2}{R_2} + \frac{z^2}{R_1 \pm R_2}\right) + \dots, \tag{93.2}$$


where the alternative choices of sign on the left and on the right are
to be made independently of one another; but when $R_1$ is sufficiently
nearly equal to $R_2$ one possible choice on the right is excluded
to all intents and purposes, i.e. if regularity is to obtain in not
too small a region surrounding the sagittal plane. Clearly we can
‘generate’ a t-symmetric system of refracting surfaces by drawing
any set of curves in the meridional plane, each curve being symmetric
about the axis of K; and then rotating these curves about the line @.
We observe straight away that if the toroidal property of K is 
expressed by the condition that K be invariant under rotations 
about @, then we must restrict these rotations to be sufficiently 
small. The situation is entirely analogous to that which we met in 
the context of the concentric r-symmetric system in Chapter 6B. 
There also we had to restrict the magnitude of the rotations, since 
otherwise we would always have ended up with complete spherical 
surfaces, and this was not desirable. We remark that if we intend
that together with every surface $x = f(y^2, z^2)$ contained in K the
branch $x = -f(y^2, z^2)$ occur in it also, then we need only impose
upon K the condition that it be reversible as a whole (see Section 97).
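
The equation of the torus and the leading terms of its expansion can be checked numerically. The sketch below assumes that the axis of rotation is the y-axis through C, so that the torus is swept out by a circle of radius $R_2$ whose centre lies at distance $R_1$ from that axis; the sign pattern used in the series is the one that follows from this parametrization. The radii and function names are illustrative.

```python
import math

def quartic(x, y, z, R1, R2):
    """Left-hand member of (93.1)."""
    return (x*x + y*y + z*z - (R1*R1 + R2*R2))**2 - 4*R1*R1*(R2*R2 - y*y)

def torus_point(phi, omega, R1, R2):
    """Point of the torus: meridional circle of radius R2, centred at
    distance R1 from the y-axis, rotated through omega about that axis."""
    rho = R1 + R2 * math.cos(phi)
    return rho * math.cos(omega), R2 * math.sin(phi), rho * math.sin(omega)

def series(y, z, R1, R2, sign):
    """Leading terms of (93.2) for the branch x > 0; sign = +1 is the
    outer (convex) part, sign = -1 the inner (concave) part."""
    a = R1 + sign * R2
    return a - 0.5 * (sign * y * y / R2 + z * z / a)

R1, R2 = 5.0, 2.0

# 1. Every point of the parametrized surface satisfies the quartic.
residual = max(abs(quartic(*torus_point(0.3 * i, 0.2 * j, R1, R2), R1, R2))
               for i in range(20) for j in range(30))

# 2. Near the axis of K the series reproduces x up to fourth-order terms.
x_o, y_o, z_o = torus_point(0.02, 0.01, R1, R2)              # outer branch
x_i, y_i, z_i = torus_point(math.pi - 0.02, 0.01, R1, R2)    # inner branch
err_outer = abs(x_o - series(y_o, z_o, R1, R2, +1))
err_inner = abs(x_i - series(y_i, z_i, R1, R2, -1))
print(residual, err_outer, err_inner)
```

Choosing $R_1$ close to $R_2$ makes the denominator $R_1 - R_2$ of the inner branch small, which is the loss of regularity referred to above.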

It may be remarked that one can take @ as the axis of an r- 
symmetric system. However, with the usual choice of coordinate 
basis (i.e. with the x- and x’-axes along @, etc.) one cannot maintain
the ray along @ as base-ray: the system is irregular. Thus, again
referring to the example of the torus $\mathcal{T}$, if $R_1 > R_2$ there is no paraxial
region in the sense that a ray sufficiently close to @ will never
encounter $\mathcal{T}$ at all; whilst if $R_1 < R_2$ the ray along @ will encounter
a singularity.


94. The angle characteristic 


A t-symmetric system is, in general, anamorphotic. Not to be
burdened with the consideration of too many special cases we
suppose once and for all that $f_1$ and $f_2$ are both finite. Then let $T_C$
be the angle characteristic when both base-points are taken at C.
For reasons by now familiar, $\gamma'$ and $\gamma$ can occur in $T_C$ only through
the function $\alpha\alpha' + \gamma\gamma'$. Symmetry about the meridional plane implies
that we need not explicitly consider $\alpha\gamma' - \alpha'\gamma$, since this
reverses sign when $\gamma$ and $\gamma'$ do so; whilst even powers of $\alpha\gamma' - \alpha'\gamma$
are redundant in virtue of the identity


$$(\alpha\gamma' - \alpha'\gamma)^2 + (\alpha\alpha' + \gamma\gamma')^2 = (1 - \beta'^2)(1 - \beta^2). \tag{94.1}$$
Again, symmetry about the sagittal plane requires that $T_C$ depend
upon $\beta'$, $\beta$ only in the combinations $\xi = \beta'^2$, $\eta = \beta'\beta$, $\zeta = \beta^2$.
We therefore write

$$T_C = J(\xi, \eta, \zeta, \chi), \tag{94.2}$$
where

$$\chi = 1 - (\alpha\alpha' + \beta\beta' + \gamma\gamma'), \tag{94.3}$$
as in (65.10). The inclusion of the term $\beta\beta'$ in (94.3) at this stage is
merely a matter of convenience. If $O_0$, $O_0'$ be at distances q, q’
to the left and right of C respectively and T refers to these as base-points,
we therefore have the generic result

$$T = J(\xi, \eta, \zeta, \chi) + q'\alpha' + q\alpha. \tag{94.4}$$
Though comparatively simple in principle, the situation is evi- 
dently getting rather involved from a computational point of view. 
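
The identity (94.1), on which the redundancy of the even powers of $\alpha\gamma' - \alpha'\gamma$ rests, is a consequence of $\alpha^2 + \beta^2 + \gamma^2 = 1$ alone. A quick numerical confirmation, with illustrative function names, might read as follows.

```python
import math
import random

def direction(rng):
    """Random direction cosines (alpha, beta, gamma) with alpha > 0."""
    b, g = rng.uniform(-0.7, 0.7), rng.uniform(-0.7, 0.7)
    return math.sqrt(1.0 - b * b - g * g), b, g

def identity_gap(d, dp):
    """Difference of the two members of (94.1) for an object-space ray d
    and an image-space ray dp (the primed quantities)."""
    a, b, g = d
    ap, bp, gp = dp
    left = (a * gp - ap * g) ** 2 + (a * ap + g * gp) ** 2
    right = (1.0 - bp * bp) * (1.0 - b * b)
    return abs(left - right)

rng = random.Random(0)
worst = max(identity_gap(direction(rng), direction(rng)) for _ in range(1000))
print(worst)
```

The left-hand member is simply $(\alpha^2 + \gamma^2)(\alpha'^2 + \gamma'^2)$, which is why only the combination $\alpha\alpha' + \gamma\gamma'$ needs to appear explicitly.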


95. The aberration function


The sagittal section of a t-symmetric system is like that of a concentric
r-symmetric system, so that one cannot have sharp imagery
associated with any pair of conjugate planes. As remarked already
several times, this does not prevent us from defining the ideal angle
characteristic to correspond to whatever imagery we desire. For the
sake of illustration we shall take the object to be at infinity. Suppose
then that a sharp image could be formed in $\mathcal{I}'$ such that

$$y' = f_1\beta/\alpha, \qquad z' = f_2\gamma/\alpha. \tag{95.1}$$
K must of course have been so designed that the meridional and
sagittal sections have coincident foci. Then, with the anterior base-point
chosen arbitrarily,

$$T_0 = g(\beta^2, \gamma^2) - (f_1\beta\beta' + f_2\gamma\gamma')/\alpha. \tag{95.2}$$
This incidentally also shows at once that K cannot have the property
represented by (95.1), since $T_0$ obviously cannot be written in the
form (94.4). Now, if q is the distance between B and C,

$$t = J(\xi, \eta, \zeta, \chi) + q'\alpha' + q\alpha + (f_1\beta\beta' + f_2\gamma\gamma')/\alpha, \tag{95.3}$$

terms depending on $\beta$ and $\gamma$ alone being consistently omitted.
At this stage it becomes convenient to write 


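$$\xi_2 = \gamma'^2, \qquad \eta_2 = \gamma'\gamma, \qquad \zeta_2 = \gamma^2$$
for the time being. Then set

$$J = (k_1\xi + k_2\eta + k_3\chi) + (p_1\xi^2 + p_2\xi\eta + p_3\xi\zeta + 2p_4\xi\chi + p_5\eta\zeta + 2p_6\eta\chi + 2p_7\zeta\chi + 4p_8\chi^2) + O(6). \tag{95.4}$$
The first-order terms of t are therefore

$$t^{(1)} = \tfrac12(2k_1 + k_3 - q')\xi + (k_2 - k_3 + f_1)\eta + \tfrac12(k_3 - q')\xi_2 + (f_2 - k_3)\eta_2. \tag{95.5}$$

Since the imagery is supposed to be in accordance with (95.1)
in the paraxial region, this shows that

$$k_1 = 0, \qquad k_2 = f_2 - f_1, \qquad k_3 = f_2, \qquad q' = f_2. \tag{95.6}$$
Then

$$\begin{aligned} t^{(3)} ={}& (p_1 + p_4 + p_8)\xi^2 + (p_2 - 2p_4 + p_6 - 4p_8)\xi\eta \\ &+ (p_3 + p_4 - 2p_6 + p_7 + 6p_8 - \tfrac14 f_2)\xi\zeta \\ &+ (p_4 + 2p_8)\xi\xi_2 - 2(p_4 + 2p_8)\xi\eta_2 + (p_4 + 2p_8 - \tfrac14 f_2)\xi\zeta_2 \\ &+ (p_5 + p_6 - 2p_7 - 4p_8 + \tfrac12 f_1)\eta\zeta + (p_6 - 4p_8)\eta\xi_2 \\ &- 2(p_6 - 4p_8)\eta\eta_2 + (p_6 - 4p_8 + \tfrac12 f_1)\eta\zeta_2 + (p_7 + 2p_8 - \tfrac14 f_2)\zeta\xi_2 \\ &- 2(p_7 + 2p_8 - \tfrac14 f_2)\zeta\eta_2 + p_8\xi_2^2 - 4p_8\xi_2\eta_2 + (6p_8 - \tfrac14 f_2)\xi_2\zeta_2 \\ &- (4p_8 - \tfrac12 f_2)\eta_2\zeta_2. \end{aligned} \tag{95.7}$$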
This corresponds almost, but not quite, to (88.6). What remains to
be done is to express f’, y’ in terms y,, and 2, respectively, according 


t tf / fd , 
B=f\b-ys Y' =fey—%, (95.8) 


where y,, 2, are, to the required order, the coordinates of the point 
of intersection of a ray with a normal plane which lies at a distance 
d'(= 1) to the left of %’. If one appropriately interprets €,,... in 
(88.6), the latter will then just represent 7@, the sixteen earlier 
coefficients p,, ...,$;_ being linear combinations of the eight third- 
order coefficients which occur in (95.4). Since then €3 = d®/dyz, 
(88.7)-(88.10) will effectively represent the third-order displacement;
and the coefficients governing the various partial displacements
mutually ‘interact’ in a way precisely analogous to that
described in Section 87.

As regards the number of independent aberration coefficients
of order $2n-1$, the situation is once again essentially similar
to that of Section 75, so that the number in question is (m+ 2).


96. On the formation of a sharp image when the object
is at infinity

In the light of the results of the last section one will naturally ask
whether a t-symmetric system can form a sharp image of the plane
$\mathscr{I}$ at infinity; and if so, one will want to know the shape of the image
surface $\mathscr{I}^{*\prime}$. Now, if $\mathscr{I}^{*\prime}$ exists, there must be functions $D_1(\zeta, \zeta_2)$,
$D_2(\zeta, \zeta_2)$ and $C(\zeta, \zeta_2)$ such that, in the usual way,


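$Y' = \beta D_1(\zeta, \zeta_2) = y' + C(\zeta, \zeta_2)\,\beta'/\alpha',$
$Z' = \gamma D_2(\zeta, \zeta_2) = z' + C(\zeta, \zeta_2)\,\gamma'/\alpha',$ (96.1)
and $D_1(0, 0) = f_1, \quad D_2(0, 0) = f_2, \quad C(0, 0) = 0.$ (96.2)
Since $y' = -\partial T/\partial\beta'$, (96.1) implies that

$T = -\alpha' C - \eta D_1 - \eta_2 D_2 + g,$ (96.3)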


where g is some function of $\zeta$ and $\zeta_2$ only. (96.3) must now be
compared with the generic form of T, i.e.


$T = J(\xi, \eta, \zeta, 1 - \alpha\alpha' - \eta - \eta_2) + q'\alpha' + q\alpha.$ (96.4)


Here $\eta_2$ occurs only in the fourth argument $\chi$ of J. If one imagines
J to be written as a power series in $\chi$, $\chi^n$ will, when $n > 1$, contain
$\eta_2$ multiplied by a function depending upon $\xi_2$. Since $D_2$ does
not depend upon this variable, it follows that J must depend
linearly upon $\chi$, say

$J = a(\xi, \eta, \zeta) + \chi\, b(\xi, \eta, \zeta).$ (96.5)


Comparison with (96.4) then shows in the first place that

$b(\xi, \eta, \zeta) = D_2(\zeta, \zeta_2),$

i.e. $D_2$ and b are both functions of $\zeta$ alone. Because $\eta^2 = \xi\zeta$ we can
write

$a(\xi, \eta, \zeta) = \eta\,\bar a(\xi, \zeta) + \hat a(\xi, \zeta),$

and then inspection of all terms involving $\eta$ shows that we must have

$\bar a(\xi, \zeta) - b(\zeta) + D_1(\zeta, \zeta_2) = 0,$

and therefore $\bar a$ and $D_1$ are functions of $\zeta$ only. We are now left
with the identity

$\hat a(\xi, \zeta) + b(\zeta) + [q' + C(\zeta, \zeta_2) - \alpha b(\zeta)]\,\alpha' + q\alpha - g(\zeta, \zeta_2) = 0.$ (96.6)
Here $\xi$ occurs only in $\hat a$ and in $\alpha'$. However, in the latter we always
have $\xi$ combined linearly with $\xi_2$, whereas $\hat a$ does not depend upon
this variable. Hence $\hat a$ cannot depend upon $\xi$, and the factor
multiplying $\alpha'$ must vanish, i.e.

$C(\zeta, \zeta_2) = -q' + b(\zeta)(1 - \zeta - \zeta_2)^{1/2}.$ (96.7)
The remaining terms of (96.6) then give
$g(\zeta, \zeta_2) = \hat a(\zeta) + b(\zeta) + q(1 - \zeta - \zeta_2)^{1/2}.$ (96.8)


The angle characteristic corresponding to the situation under
discussion is therefore finally

$T = q'\alpha' - (\alpha\alpha' + \gamma\gamma')D_2(\zeta) - \beta\beta' D_1(\zeta) + g(\zeta, \zeta_2).$ (96.9)

We thus see that a sharp image of an object at infinity is indeed
attainable. The parametric equation of $\mathscr{I}^{*\prime}$ is

$(X' + q')^2 + Z'^2 = (1 - \zeta)D_2^2(\zeta), \quad Y'^2 = \zeta D_1^2(\zeta).$ (96.10)

Here (96.2) should be recalled, along with the last member of
(95.6), which also follows from (96.7).

$\mathscr{I}^{*\prime}$ is a surface of revolution about the symmetry axis, a result
which is perhaps not very surprising. It is possible for $\mathscr{I}^{*\prime}$ to be a
circular cylinder, namely when

$D_2 = (1 - \zeta)^{-1/2} f_2.$ (96.11)




At any rate, we have found all possible functions, as given by (96.9),
which could serve as ideal characteristic functions $T_0$ (when the
object is at infinity) if one requires $T_0$ to represent a kind of sharp
imagery whose realization by K is not precluded a priori. In
later sections we shall see how the shape of $\mathscr{I}^{*\prime}$ is further restricted
by various ancillary conditions imposed upon K.
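
The parametric description of the image surface can be checked numerically. The following is a minimal sketch, not part of the original treatment: it assumes that X' is given by C of (96.7) with b = D_2, that Y' and Z' are given by (96.1), and that q' = f_2 as required by C(0, 0) = 0. The function names and the numerical focal lengths are hypothetical.

```python
import math

def image_surface_point(beta, gamma, D1, D2, q_prime):
    """Return (X', Y', Z') on the image surface for the object direction (beta, gamma).

    D1 and D2 are callables of zeta = beta**2, as in (96.9).
    X' = C is taken from (96.7) with b = D2; Y' and Z' come from (96.1).
    """
    zeta, zeta2 = beta * beta, gamma * gamma
    alpha = math.sqrt(1.0 - zeta - zeta2)
    x = -q_prime + alpha * D2(zeta)
    y = beta * D1(zeta)
    z = gamma * D2(zeta)
    return x, y, z

def residuals_96_10(beta, gamma, D1, D2, q_prime):
    """Differences between the two sides of each member of (96.10)."""
    zeta = beta * beta
    x, y, z = image_surface_point(beta, gamma, D1, D2, q_prime)
    r1 = (x + q_prime) ** 2 + z ** 2 - (1.0 - zeta) * D2(zeta) ** 2
    r2 = y ** 2 - zeta * D1(zeta) ** 2
    return r1, r2

# Hypothetical focal lengths; q' = f2 follows from C(0, 0) = 0.
f1, f2 = 2.0, 3.0
D1 = lambda zeta: f1
D2_cyl = lambda zeta: f2 / math.sqrt(1.0 - zeta)   # the choice (96.11)

for beta, gamma in [(0.0, 0.0), (0.3, 0.1), (-0.2, 0.4)]:
    x, y, z = image_surface_point(beta, gamma, D1, D2_cyl, f2)
    radius = math.hypot(x + f2, z)   # distance from the axis of revolution
    print(beta, gamma, residuals_96_10(beta, gamma, D1, D2_cyl, f2), radius)
```

With the choice (96.11) the printed radius stays equal to f_2 for every direction, which is the circular cylinder mentioned in the text.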


97. The system reversible as a whole 


The condition of reversibility on K as a whole is that the angle
characteristic, taken with respect to base-points disposed
symmetrically about C, must be invariant under the mutual
interchange of $\beta$ and $\beta'$. Since under this operation $\chi$ is invariant in any
event, we see from (94.4) that reversibility requires the relation


$J(\xi, \eta, \zeta, \chi) = J(\zeta, \eta, \xi, \chi)$ (97.1)


to be identically satisfied. The number of independent third-order 
coefficients is then reduced from eight to six, since (97.1) entails 


eres) Po2= Ps Ps = Pai (97-2) 
only the coefficients relating to fixed conjugates being taken into 
account as usual. The coefficients p, here are of a kind somewhat 
different from those which appear in (56.4), say, in the sense that 
their vanishing does not entail the absence of aberrations. For this 
reason the observation that the conditions of reversibility (97.2) 
are homogeneous is irrelevant. With regard to higher orders, one
has five fifth-order relations of the kind (97.2), nine seventh-order
relations, and so on.

If the condition of reversibility be imposed under the
circumstances contemplated in Section 96 we have to require the function
on the right of (96.9) to be invariant when $\beta$ and $\beta'$ are interchanged,
q having been chosen to be equal to q'. It follows at once that $D_1$
and $D_2$ must be constants, and that $q'\alpha' - g(\xi, \xi_2)$ is a constant. Thus


$T = \text{const.} + (\alpha + \alpha' - \alpha\alpha' - \gamma\gamma')f_2 - \beta\beta' f_1.$ (97.3)
The surface $\mathscr{I}^{*\prime}$ has the equation
$\frac{(X' + f_2)^2 + Z'^2}{f_2^2} + \frac{Y'^2}{f_1^2} = 1,$ (97.4)


i.e. it is an ellipsoid of revolution.
98. The meridional section reversible


Even when K as a whole is not reversible, the meridional section 
may be, i.e. there may exist a line #, normal to .~, about which it is 
symmetric. In view of the invariance under rotations about ¢ 
this implies reversibility, in an obvious sense, of the section of K 
by any plane containing @. Now let L be the axial point of f and 
take base-points which are situated symmetrically about L. Then 
there exists a constant p such that $J(\xi, \eta, \zeta, \chi) + p\alpha$ is invariant under
the mutual interchange of $\beta$ and $\beta'$. It should be borne in mind that
we now have to take $\xi_2 = \eta_2 = \zeta_2 = 0$, and that one can always add a
constant multiple of $\alpha + \alpha'$ to any function whose invariance under
the substitution in question is under consideration. By way of
example, when the system as such lies entirely to the left of the
normal plane through C, then p is twice the distance between L
and C. To proceed from here, let us suppose for the sake of
simplicity that the object is at infinity. Then, to require the invariance
of $J + p\alpha$ is tantamount to requiring the invariance of J alone,
since the presence of the additional term will have consequences
only as regards relations involving coefficients multiplying powers
of $\zeta$ alone, and these are of no interest. The reversibility condition
therefore reduces to

$J(\xi, \eta, \zeta, \chi) = J(\zeta, \eta, \xi, \chi),$ (98.1)
where $\chi = 1 - \eta - (1 - \xi)^{1/2}(1 - \zeta)^{1/2}$.
This is considerably weaker than (97.1), on account of the absence
of the variables $\gamma$ and $\gamma'$. Indeed, writing (98.1) out in full in the

third-order region, we are concerned with the invariance of the 
expression 


DiS" + Point PabS+ poh + (dsb + Pe + P76) (E—29 + 6) 
+ p(E—29 + 0)*+$hk,(f? — 226+ 67). (98.2) 
The required equality of the coefficients of £7 and 9 gives 


P2—Ps— 2(P_—P,) = 9, (98.3) 


which should be compared with (97.2). It is worthy of note that 
when p = 0, i.e. @ and £ coincide, K is in fact reversible as a whole, 
but the two separate conditions (97.2) do not emerge here. We shall 
meet a somewhat analogous situation in Section 99. 
When the present kind of reversibility obtains we may again
investigate the possibility of a sharp image being formed of an
object at infinity. Writing


$f_2\, d(\zeta) = (1 - \zeta)^{1/2} D_2(\zeta),$ (98.4)


and $f_2\, g(\zeta)$ for the sum of all functions of $\zeta$ alone which appear when
$\xi_2 = \eta_2 = \zeta_2 = 0$ in $J + p\alpha$, one has the condition


(1 —£)8 d(C) —fa'9D,(C) + a(6) = (1 -6)# dE) 
—fa'9D(5) +88). (98-5) 
Evidently $D_1(\zeta) = \text{const.} = f_1$. Then, setting $\xi = 0$, and recalling
that $d(0) = 1$, there comes
$d(\zeta) + g(\zeta) = (1 - \zeta)^{1/2} + g(0).$


Resubstituting this in (98.5) one obtains an equation which may be
written as


$[(1 - \xi)^{1/2} - 1]\,[1 - d(\zeta)] = [(1 - \zeta)^{1/2} - 1]\,[1 - d(\xi)],$


and this shows that the ratio of $1 - d(\zeta)$ to $(1 - \zeta)^{1/2} - 1$ must be a
constant, k, say. Thus


$d(\zeta) = 1 + k[1 - (1 - \zeta)^{1/2}].$ (98.6)


In the first of equations (96.10) one has on the right just $f_2^2 d^2(\zeta)$,
whilst the second here reads $Y'^2 = f_1^2\zeta$. Hence, under the present
conditions, $\mathscr{I}^{*\prime}$ has the equation


$(X'^2 + Z'^2)^{1/2} = f_2(1 + k) - f_2 k\,(1 - Y'^2/f_1^2)^{1/2},$ (98.7)
the origin having been moved to C. To appreciate the nature of
this surface, it suffices to consider its meridional section, since it is
a surface of revolution about the symmetry axis. Setting $Z' = 0$ we then have


$\frac{[X' - f_2(1 + k)]^2}{k^2 f_2^2} + \frac{Y'^2}{f_1^2} = 1.$ (98.8)


This is the equation of an ellipse, so that $\mathscr{I}^{*\prime}$ is that part of an
elliptical toroid which passes through the axial ideal image point.
The special case of a torus arises when $k^2 = f_1^2/f_2^2$. When $k = -1$ one
has the ellipsoid (97.4), whilst the circular cylinder, to which
(96.11) refers, has $k = 0$.
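
The passage from (98.6) to the ellipse (98.8) can be verified numerically. The sketch below is only an illustration under stated assumptions: the origin is at C, the positive root is taken for X', and the focal lengths and values of k are hypothetical.

```python
import math

def d_of_zeta(zeta, k):
    """The function d(zeta) of (98.6)."""
    return 1.0 + k * (1.0 - math.sqrt(1.0 - zeta))

def meridional_point(beta, f1, f2, k):
    """(X', Y') in the meridional section, origin at C.

    Uses the first of (96.10) with right-hand side f2**2 * d**2 and
    Y' = f1 * beta; the positive root is taken for X'.
    """
    zeta = beta * beta
    return f2 * d_of_zeta(zeta, k), f1 * beta

def ellipse_lhs(x, y, f1, f2, k):
    """Left-hand side of (98.8); it should equal 1 on the surface (k != 0)."""
    return (x - f2 * (1.0 + k)) ** 2 / (k * f2) ** 2 + y ** 2 / f1 ** 2

f1, f2 = 2.0, 3.0   # hypothetical focal lengths
for k in (-1.0, 0.5, f1 / f2):
    for beta in (0.0, 0.25, 0.6):
        x, y = meridional_point(beta, f1, f2, k)
        print(k, beta, ellipse_lhs(x, y, f1, f2, k))
```

The value k = -1 reproduces the meridional section of the ellipsoid (97.4), and k = f_1/f_2 gives the circular section of a torus.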
99. The meridional section concentric


It remains to study the consequences of the concentricity of the 
meridional section. Physically the state of affairs now contemplated 
is that every surface of constant refractive index (or else every 
refracting or reflecting surface) is the surface of a torus, the generic 
equation of which is given by (93.4) with R, fixed, whilst R, is to 
be regarded as a variable parameter. If the ‘centre’ C* of the
meridional section lies at the point x = p (where p may be negative),
the angle characteristic T*, referred to base-points coincident at
C*, is

$T^* = J(\xi, \eta, \zeta, \chi) + p(\alpha' - \alpha).$ (99.1)


Concentricity requires that this function depend on $\beta$ and $\beta'$
only through the function $\alpha\alpha' + \beta\beta'$ when $\xi_2 = \eta_2 = \zeta_2 = 0$, i.e.


$J(\xi, \eta, \zeta, 1 - \alpha\alpha' - \beta\beta') + p(\alpha' - \alpha) = T(\alpha\alpha' + \beta\beta'),$
say. It follows that, with
$\tau = 1 - (1 - \beta^2)^{1/2}(1 - \beta'^2)^{1/2} - \beta\beta',$ (99.2)
$J(\xi, \eta, \zeta, \chi) = G(\tau, \chi) + p(\alpha - \alpha'),$ (99.3)


where G is some function of two arguments. The angle characteristic
of the t-symmetric system with concentric meridional section
is therefore

$T = G(\tau, \chi) + (q' - p)\alpha' + (q + p)\alpha,$ (99.4)


and, using this result, the imagery may be discussed in the usual 
way. 

One interesting conclusion which follows from (99.4), with
$p \neq 0$, is that a system of the kind under consideration can never
form a sharp image of an object at infinity at all. To see this, one
need only compare (99.4) with (96.9). The linear occurrence of
$\gamma\gamma'$ in the latter, coupled with the fact that the factor multiplying
it depends upon $\zeta$ only, requires that G depend linearly upon $\chi$.
Analogous reasoning concerning the term in $\beta\beta'$ then shows that
G must also depend linearly upon $\tau$; and then there remains no
way in which the variable $\xi_2$, which occurs in $p\alpha'$, can be
accommodated.

The case with p = 0 is exceptional, since $\xi_2$ then no longer enters
into the argument, and one finds that a sharp image may be attained
if, and only if, $D_1(\zeta) = D_2(\zeta) = \text{const.} = f$, say, and $G(\tau, \chi) = f\chi$.
Then, however, one has simply the characteristic function of a
concentric (r-symmetric) system.

The conclusion at which we have just arrived might be held to
be obvious, were one to think that when p = 0 the system is
necessarily concentric in the sense of Chapter 6B. Yet we are made
suspicious by the fact that, in (94.4), J gives no indication of
reducing to a function of $\chi$ alone when p = 0. The fact of the matter
is that when C and C* coincide, a system which is t-symmetric with
concentric meridional section need not be invariant under arbitrary
rotations. However artificially, one can devise a direction-dependent
refractive index, e.g. a function of $x^2 + y^2 + z^2$ and of $(x\gamma - z\alpha)^2$,
such that the system, though internally anisotropic, has all the
symmetries here required of it. We have, in fact, thus returned to
the point made at the end of Section 84.


Problems 


P.9 (i). Find the number of aberration coefficients of order $2n-1$
of the t-symmetric system with concentric meridional section.


P.9 (ii). The spherical aberration of a t-symmetric system vanishes
to all orders for an object at infinity. What conditions must the
function J of (94.4) satisfy? Show that your result is consistent with
equations (95.6) and (95.7). (It is given that the system forms a
sharp paraxial image.)


P.9 (iii). A parallel bundle of rays incident upon a system like
that of P.9 (ii) produces a circular image patch ($\beta = \gamma = 0$). What
condition must be satisfied by the function J of (94.4) so that this
may be the case?


CHAPTER 10 


CHROMATIC DEFECTS OF THE IMAGE 


100. Introductory remarks. Chromatic aberration 
coefficients 


Hitherto we have concerned ourselves exclusively with the
monochromatic image produced by given systems. This means that the
light traversing K was always supposed to have a single fixed
wavelength. In practice, however, it is quite usual for the light to be white,
or, at any rate, to have some spectral distribution in which various
colours are represented. It therefore becomes necessary to take
due account of the fact that the distribution of refractive index
within K depends upon the wavelength $\lambda$. In a sense therefore,
every system K, imagined in the first place as relating to some fixed
value of $\lambda$, generates a whole collection of systems, one for every
value of $\lambda$ to be contemplated; granted, of course, that K is not a
purely reflecting system.

It is a matter of great convenience to refer the properties of K at
wavelength $\lambda$ to those which obtain for light of some fixed colour,
henceforth called the base-colour. The wavelength of this will be
denoted by $\underset{0}{\lambda}$ and, consistently with this notation, if $Q(\lambda)$ is any
quantity depending on $\lambda$, then $\underset{0}{Q}$ shall stand for $Q(\underset{0}{\lambda})$. Though the
choice of $\underset{0}{\lambda}$ is to a certain extent arbitrary, we shall see in Section 101
that it is, in practice, approximately determined by the extremes of
the range of wavelengths admitted in any particular problem.
Now let $\underset{0}{F}(q_1, q_2, q_3, q_4)$ be the characteristic function of K when
$\lambda = \underset{0}{\lambda}$, so that $\underset{0}{F}$ is a certain optical distance measured along the
ray $\underset{0}{\mathscr{R}}$ specified by the values of the ray-coordinates in question. For
a different wavelength the ray $\mathscr{R}$ specified by the same values of the
$q_i$ will, in general, differ from $\underset{0}{\mathscr{R}}$, and in any event the optical length
in question has to be calculated with the refractive index
appropriate to $\lambda$, not $\underset{0}{\lambda}$. It follows that the value of F will differ from that
of $\underset{0}{F}$. In short, the characteristic function is now to be regarded as a
function of both the ray-coordinates and of $\lambda$.

At this stage we meet for the first time a problem of a peculiar
kind. It is this: F will usually be a very complicated function of $\lambda$,
just as it is a very complicated function of the $q_i$. Therefore one will
naturally be inclined to write F now as a power series in the five
variables $q_1, q_2, q_3, q_4$ and $\delta\lambda\ (= \lambda - \underset{0}{\lambda})$. For the base-ray $\mathscr{R}_0$ all five
of them of course take the value zero. The dependence of F upon
$\delta\lambda$ reflects the dependence of the refractive index N upon $\delta\lambda$, so
that N also will then have to be written as a series of ascending
powers of $\delta\lambda$. However, if this procedure is to have any usefulness
at all, the series in question must converge reasonably quickly for
ranges of $\delta\lambda$ encountered in practice. It is unfortunately true that
this condition is certainly not satisfied by N: the convergence is
exceedingly slow. This, then, is the difficulty to be overcome;
that is to say, we have to find some parameter, $\omega$ say, which is a
function of $\delta\lambda$ and vanishes together with it, such that the series for N
in ascending powers of $\omega$ will converge rapidly, and that for F along
with it. Such a parameter can, indeed, be found in any given
situation, and we call it the chromatic coordinate; see Section 101.
We now write


$F = F(q_1, q_2, q_3, q_4, \omega) = \sum_{m=0}^{\infty} \underset{m}{F}(q_1, q_2, q_3, q_4)\,\omega^m.$ (100.1)


The ideal characteristic function $F_0$ is defined, as always, in terms of
the desired ‘ideal’ behaviour of K; and one will usually require this
behaviour, i.e. not $F_0$ itself, to be independent of $\omega$. The aberration
function is then
$f = F - F_0 = \sum_{n=0}^{\infty} f_n = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty} \underset{m}{f}_n\,\omega^m.$ (100.2)
It contains all the information concerning the extent to which a
ray through O will in fact fail to pass through the required point J',
whatever its colour may be.
To every aberration function of order n encountered hitherto
there now corresponds an infinite sequence $\underset{m}{f}_n$ of aberration
functions of coordinate order n and chromatic order m. Note that $f_n$ now
stands for $\sum_{m=0}^{\infty} \underset{m}{f}_n\,\omega^m$, and it must not be confused with $\underset{0}{f}_n$. At this
point we need to remark explicitly on the occurrence of the terms $f_0$
and $f_1$ in (100.2). Taking $f_1$ first, this will always be present when,
in accordance with the demands of actual practice, displacements
are referred to a fixed reference point J' lying in a fixed image
surface, i.e. that corresponding to the base-colour. Therefore, even
when the parabasal rays from O unite in a common point in the
image space, this point will, in general, not coincide with J'. On the
other hand we shall regard quantities such as the magnification
and the focal length as fixed, i.e. independent of $\lambda$; all effects of
their actual non-constancy already being taken into account by the
aberration function.

The presence of $f_0$ has a different origin. It will be recalled that
under the most general circumstances the coordinate basis was
suitably arranged relative to the base-ray $\mathscr{R}_0$. If we agree that the
coordinate basis is to be fixed, it follows that if $\mathscr{R}_0$ refers to $\underset{0}{\lambda}$ then
at wavelength $\lambda$ the ray specified by $q_i = 0$ (i = 1, ..., 4) will not,
in general, coincide with $\mathscr{R}_0$. Thus when $\lambda \neq \underset{0}{\lambda}$ one has the
unavoidable complication that the displacement already contains
terms of order zero. So much for matters of principle. In order
not to overburden ourselves with an excess of detail we henceforth
consider only systems which are at least doubly plane-symmetric,
in the sense of Section 88. Then the base-ray can be taken along
the axis and will therefore be independent of $\lambda$, so that $f_0 = 0$. In future
all sums over n will thus start with n = 1, rather than n = 0.

To return to matters of terminology, $\underset{0}{f}_n$ will naturally be called
the nth-order ‘monochromatic aberration function’. Again, since
in practice one rarely needs to consider chromatic orders beyond
the second, or at worst beyond the third, we refer to $\underset{1}{f}_n$ ($\underset{2}{f}_n$, $\underset{3}{f}_n$, ...)
as the ‘primary (secondary, tertiary, ...) chromatic nth-order
aberration function’, the qualification ‘coordinate’ being suppressed
here, since it is largely redundant. Whatever we lose here through
long-windedness we more than regain in precision. With regard
to the displacement, we have, analogously to (100.2),


$\epsilon' = \sum_{m=0}^{\infty} \underset{m}{\epsilon}'\,\omega^m = \sum_{n=1}^{\infty}\sum_{m=0}^{\infty} \underset{m}{\epsilon}'_n\,\omega^m,$ (100.3)
so that, for example, $\underset{0}{\epsilon}'$ is the total monochromatic displacement,
$\underset{1}{\epsilon}'_n\,\omega$ the primary chromatic nth-order displacement, and so on.


We can extend this terminology to types of aberrations without
difficulty, bearing in mind that the chromatic displacement is
essentially similar to the monochromatic displacement from a
geometrical point of view. Since we are supposing K to be at least
doubly plane-symmetric we may use an extension of the notation of
Section 16. Thus

o nm fb 
fP= TDD fer ryerlram. (100.4) 
m=0 p=0r=0m 
To avoid irrelevant verbal difficulties we proceed in terms of a 
specific case, namely, when K is symmetric, with F = V. Then 
vy is the (characteristic) coefficient of monochromatic (27 — 1)th- 


order spherical aberration, vf} is that of primary chromatic (2 —1)th- 
1 
order spherical aberration, and so on. Similarly, v§, say, will be 
2 


the (characteristic) coefficient of secondary chromatic fifth-order 
circular coma. In short, the way in which the terminology extends 
itself to the various coefficients, whether characteristic or effective, 
governing the aberrations of the various types, virtually explains 
itself. 

We may, of course, also continue to speak of ‘spherical
aberration’, ‘circular coma’, etc., as a whole. Thus we previously had in v
functions $S(\xi)$, $C(\xi)$ corresponding to these types, according to
equation (26.1). Under the present circumstances we shall have to


mnie v = ve + S(E, w) +9C(E, w), (100.5) 
but the general situation is essentially the same as before; see 
Section 107. 


With regard to the state of chromatic correction, various terms 
are in common use, such as achromatism, apochromatism, and the 
like. It will be more convenient to return to these at a later stage; 
see Section 104. We may, however, raise at this point the question 
how remainder terms are now to be denoted. Previously, when 
truncating a series (say one representing a displacement) after its
nth-order term, i.e. after the terms of degree n in the ray-coordinates,
all rejected terms were collectively denoted by the symbol O(n + 1),
for O(n) stood for any sequence not containing terms of degree
less than n. This notation is now amplified by understanding $O(n_m)$
to stand for any collection of terms none of which is of degree less
than n in the ray-coordinates and less than m in the chromatic
coordinate. However, a further slight extension is required in view of
the fact that m will generally depend upon n, in a sense adequately
illustrated by the example of the displacement of axial rays in a
symmetric system. Let it be approximated by the expression


é = (cot+ctw*) pt (pt +pt a) p?+ sf; (100.6) 
1 2 0 1 0 


that is to say, in some particular case it is assumed that knowledge
of the five coefficients which occur on the right implies a sufficiently
accurate description of axial chromatism and spherical aberration
for the values of $\rho$ and $\omega$ of interest. Evidently the remainder term,
if it is to be sufficiently explicit, must be written as $O(1_3, 3_2, 5_1, 7_0)$.
Then one knows exactly that the terms not included explicitly
consist of the paraxial terms which are at least cubic in $\omega$, of the
secondary and higher-order chromatic primary spherical aberration,
and so on.
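
The bookkeeping behind the remainder symbol can be illustrated with a short sketch. It is not part of the original text: the retained terms are stored under hypothetical keys (n, m) for coordinate order n and chromatic order m, the numerical coefficient values are invented for illustration, and it is assumed that only odd coordinate orders occur, as for axial rays in a symmetric system.

```python
def evaluate(coeffs, rho, omega):
    """Sum c[(n, m)] * rho**n * omega**m over the retained terms."""
    return sum(c * rho ** n * omega ** m for (n, m), c in coeffs.items())

def remainder_symbol(coeffs):
    """Build the symbol O(n_m, ...) for a truncated axial displacement.

    Assumes only odd coordinate orders occur and that, for each order n,
    every chromatic order above the highest retained one is rejected.
    """
    highest = {}
    for n, m in coeffs:
        highest[n] = max(m, highest.get(n, -1))
    parts, n = [], 1
    while n in highest:
        parts.append(f"{n}_{highest[n] + 1}")
        n += 2
    parts.append(f"{n}_0")   # first coordinate order not retained at all
    return "O(" + ", ".join(parts) + ")"

# Hypothetical numerical values for the five retained coefficients.
coeffs = {(1, 1): 2e-3, (1, 2): -1e-3, (3, 0): 5e-2, (3, 1): 4e-3, (5, 0): 1e-1}
print(remainder_symbol(coeffs))
print(evaluate(coeffs, rho=0.1, omega=0.05))
```

For the five terms chosen here the first line printed is `O(1_3, 3_2, 5_1, 7_0)`, in agreement with the symbol quoted in the text.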


101. The chromatic coordinate 


As has been explained, to achieve rapid convergence of the various
chromatic power series, we have to choose a function $\omega(\delta\lambda)$ such
that N (or, for that matter, 1/N) when written as a series in
ascending powers of $\omega$, in some sense converges sufficiently rapidly. We
write the series as

$\delta N = N - \underset{0}{N} = \underset{0}{N}\sum_{m=1}^{\infty} \nu_m\,\omega^m.$ (101.1)

K is here supposed to be made up, at least in part, of various
homogeneous media, so that each such medium is characterized by the
values of $\underset{0}{N}$ and the constants $\nu_m$ which occur on the right of (101.1).
$\omega$, on the other hand, must be the same function for every medium
within K. This means incidentally that the results of investigations
which attempt to represent N as the linear superposition of several
universal functions of $\lambda$ are irrelevant to our task, however
interesting they may be in other respects.


Now, within the visual spectrum the refractive index of ordinary
crown glasses is quite well represented by Hartmann's equation

$N = N_* + \frac{c}{\lambda - \lambda_*},$ (101.2)
where $N_*$, c, $\lambda_*$ are constants of the particular glass. For any such
glass, therefore, there are constants $\nu_1$ and $\alpha$ such that


$\delta N = \frac{\underset{0}{N}\,\nu_1\,\delta\lambda}{1 + \alpha\,\delta\lambda}$ (101.3)


for any choice of $\underset{0}{\lambda}$. Taking the micron ($= 10^{-3}$ mm) as a convenient
unit of length and $\underset{0}{\lambda}$ to be $\lambda_D$, i.e. the wavelength of the sodium-D
line, $\nu_1$ and $\alpha$ may be approximately fitted to a wide range of glasses.
It turns out that $\alpha$ never differs very substantially from $\tfrac{5}{2}$.
Accordingly we now define, with $\underset{0}{\lambda} = \lambda_D$,


$\omega = \frac{\delta\lambda}{1 + \tfrac{5}{2}\,\delta\lambda}.$ (101.4)


If for any reason one adopts some other value of $\underset{0}{\lambda}$, e.g. if the range
of wavelengths of interest is other than the visible, then the number
multiplying $\delta\lambda$ in the denominator should of course be chosen
appropriately.
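
The step from Hartmann's equation to (101.3) and the definition (101.4) can be made concrete with a small sketch. Everything numerical here is an assumption made for illustration: the glass constants are hypothetical, and the helper names are not taken from the text. For a Hartmann glass one finds $\alpha = 1/(\underset{0}{\lambda} - \lambda_*)$ and $\nu_1 = -c\,\alpha^2/\underset{0}{N}$ by direct subtraction.

```python
LAMBDA_D = 0.5893   # sodium-D wavelength in microns

def hartmann_index(lam, n_star, c, lam_star):
    """Hartmann's equation (101.2)."""
    return n_star + c / (lam - lam_star)

def chromatic_coordinate(lam, lam0=LAMBDA_D, alpha=2.5):
    """The chromatic coordinate (101.4), wavelengths in microns."""
    d = lam - lam0
    return d / (1.0 + alpha * d)

def hartmann_constants(n_star, c, lam_star, lam0=LAMBDA_D):
    """Return (N0, nu1, a) with dN/N0 = nu1*dlam/(1 + a*dlam) exactly, cf. (101.3)."""
    n0 = hartmann_index(lam0, n_star, c, lam_star)
    a = 1.0 / (lam0 - lam_star)
    nu1 = -c * a * a / n0
    return n0, nu1, a

# Hypothetical glass constants (lengths in microns); a comes out near 5/2.
n_star, c, lam_star = 1.5, 0.01, 0.2
n0, nu1, a = hartmann_constants(n_star, c, lam_star)

for lam in (0.4358, 0.4861, 0.6563):
    d = lam - LAMBDA_D
    exact = (hartmann_index(lam, n_star, c, lam_star) - n0) / n0
    with_omega = nu1 * chromatic_coordinate(lam)   # one term of the series in omega
    with_dlam = nu1 * d                            # one term of the series in delta-lambda
    print(lam, exact, with_omega, with_dlam)
```

For this invented glass the single term in $\omega$ stays within about one per cent of the exact relative index change across the visible range, whereas the single term in $\delta\lambda$ is off by tens of per cent at the blue end.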

With (101.4) the series (101.1) does indeed converge quite
rapidly and with, say, four terms the representation of the refractive
index of any glass should be adequate. It must be emphasized that
it would be quite wrong to object to (101.4) on the grounds that many
glasses do not obey Hartmann's formula well: all that concerns us
is that the function $\nu_1\omega$ represents $\delta N/\underset{0}{N}$ vastly better than does
$\nu_1\,\delta\lambda$. In practice one should proceed as follows. Taking the
demands of the particular design problem into account, one first
decides to what chromatic order one intends to proceed. Let it be
p (> 1). The series (101.1) is then truncated after the pth term,
and the resulting polynomial of degree p fitted to p + 1 known values
$\underset{0}{N}, N_1, N_2, \ldots, N_p$ of the refractive index. This at first sight perhaps
somewhat strange procedure has the great virtue that the value of
$f_n$, as given by the truncated series

$\sum_{m=0}^{p} \underset{m}{f}_n\,\omega^m,$

differs from its actual value when $\omega$ takes the values corresponding
to $N_1, N_2, \ldots, N_p$ solely on account of this truncation, i.e. it is not
due to the inaccurate representation of N by a polynomial of
degree p. It is, of course, understood that where 1/N is required
in the course of calculation, it is to be written as a power series
whose product with
$\underset{0}{N}\left(1 + \sum_{m=1}^{p} \nu_m\,\omega^m\right)$
is exactly unity.
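
The fitting procedure and the reciprocal series can be sketched as follows. This is an illustration only: the index values are hypothetical, the function names are invented, and a plain Gaussian elimination stands in for whatever fitting method one prefers. The fit makes the truncated series (101.1) pass exactly through the p sampled values, and the reciprocal coefficients follow from requiring that the product of the two series be unity term by term.

```python
def chromatic_coordinate(lam, lam0=0.5893, alpha=2.5):
    """The chromatic coordinate (101.4), wavelengths in microns."""
    d = lam - lam0
    return d / (1.0 + alpha * d)

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_nu(n0, samples, lam0=0.5893):
    """Fit nu_1..nu_p so that the truncated series reproduces the p samples.

    samples is a list of p pairs (wavelength, index); together with the
    base value n0 this uses p + 1 known index values.
    """
    p = len(samples)
    omegas = [chromatic_coordinate(lam, lam0) for lam, _ in samples]
    a = [[w ** k for k in range(1, p + 1)] for w in omegas]
    b = [(n - n0) / n0 for _, n in samples]
    return solve_linear(a, b)

def index_from_series(n0, nu, omega):
    """Evaluate N0 * (1 + sum_m nu_m * omega**m)."""
    return n0 * (1.0 + sum(v * omega ** (k + 1) for k, v in enumerate(nu)))

def reciprocal_series(nu, terms):
    """Coefficients r_0, r_1, ... of the power series of 1/(1 + sum nu_m w^m)."""
    r = [1.0]
    for k in range(1, terms):
        r.append(-sum(nu[j - 1] * r[k - j] for j in range(1, min(k, len(nu)) + 1)))
    return r

# Hypothetical index values at the base wavelength and three others (microns).
n0 = 1.5200
samples = [(0.4861, 1.5260), (0.6563, 1.5175), (0.4358, 1.5305)]
nu = fit_nu(n0, samples)          # here p = 3
r = reciprocal_series(nu, 6)      # 1/N = (1/N0) * sum_k r[k] * omega**k
print(nu)
print(r)
```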
When the constitution of K differs from that contemplated above
we shall take it for granted that an appropriate chromatic coordinate
can always be defined. For example, in an electron-optical
instrument $\omega$ will be taken proportional to $E - \underset{0}{E}$, where E is the actual
energy of electrons, whilst $\underset{0}{E}$ is a ‘base-energy’, e.g. the mean
energy, all these energies being measured in the field-free regions
of the system.


102. On the reduction of distances 


With our original convention concerning the constancy of the
refractive indices in the object and image spaces it became possible
to eliminate the explicit appearance of N and N' from our equations
by introducing reduced distances (Section 6). We can, alas, no
longer usefully adhere to this convention when N and N' have to
be considered as functions of $\omega$. If the symbol y were to stand for
a reduced object coordinate we would have the unhappy situation
that a given object point would no longer be characterized by the
constancy of y. This feature would be likely to lead to endless
confusion, and we have no option but to abandon the reduction of
distances as previously understood.

However, in order to have equations which become formally
identical with those of the purely monochromatic theory when
$\omega = 0$, we shall agree to understand all distances to be quasi-reduced,
meaning that if X is any distance in a region in which the
base-value of the refractive index is $\underset{0}{N}$ (independently of the
ray-coordinates) then X shall in fact stand for $\underset{0}{N}$ times the actual distance in
question. Then m, for instance, has exactly the meaning attached to
it hitherto, bearing in mind that quantities such as d', m, f etc., are
in any event to be taken as referring to the base-colour only.

Let us see, by way of example, what the ideal point characteristic
of a symmetric system looks like, under the circumstances
contemplated in Section 15. Beginning with unreduced quantities


$V_0 = V(O, O') - N'[d'^2 + (y' - my)^2 + (z' - mz)^2]^{1/2}.$ (102.1)
Quasi-reduction implies the transcription
$d' \to d'/\underset{0}{N}', \quad y' \to y'/\underset{0}{N}', \quad y \to y/\underset{0}{N}, \quad m \to \underset{0}{N}m/\underset{0}{N}',$
and then (102.1) becomes, on setting the fixed constant d’ equal to 


Gare Ve = 9(6)— v(x +0), (102.2) 
where $\nu = N/\underset{0}{N} = 1 + \sum_{m=1}^{\infty} \nu_m\,\omega^m,$ (102.3)


in view of (101.1); and the other symbols have their familiar
meanings, though g may now depend also upon $\omega$. In particular
$y_1' = my$ is the quasi-reduced ideal image height, which is as it
should be, since the ideal image point for all colours is the same as
that for the base-colour under the procedure here adopted.

It will be seen that $V_0$ does depend upon $\omega$; but, as regards ideal
imagery, this dependence is compensated by the fact that in place
of (6.1) we must now use the equations


$\partial V/\partial y' = \nu'\beta', \quad \partial V/\partial y = -\nu\beta.$ (102.4)


103. Paraxial aberrations 


The relatively detailed discussion of chromatic defects will be 
entirely confined to symmetric systems, not least because it will 
turn out to be an adequate prototype of what has to be done when 
one is confronted with other symmetries. 

The occurrence of the factor $\nu'$ on the right of (102.2) suggests that
we write the first-order aberration function in the form


$v_1 = \nu'(\tfrac{1}{2}c_1\xi + c_2\eta),$ (103.1)
where $c_1$ and $c_2$ are ‘constants’ depending upon $\omega$, and $\underset{0}{c}_1 = \underset{0}{c}_2 = 0$.
The displacement is then easily found to be

$\epsilon_1' = c_1 y' + c_2 y_1'.$ (103.2)


We note in passing that even paraxially rays grazing the rim of the
stop will not, in general, intersect the plane $\mathscr{E}'$ in a circle concentric
with $E'$ when $\omega \neq 0$, since for arbitrary values of $\omega$ $E'$ is not the image
of the axial point of the plane of the stop. However, we are once
again disregarding aberrations associated with the pupil planes,
since to do so in the present context is surely justified in all but the
most extreme cases.

According to (103.2) there are two chromatic paraxial defects
known traditionally as longitudinal and transverse chromatic
aberration respectively, the qualification ‘paraxial’ usually being
omitted. The reasons for this terminology are pretty obvious. In an
image plane moved out of focus by the amount $c_1/(1 - c_1)$ the displacement
is independent of $y'$, so that axial rays come to a focus in the axial
point of this plane. Moreover, one then has $Y' = [(1 + c_2)/(1 - c_1)]\,y_1'$,
so that the effective magnification exceeds m by the fractional amount


$(m_{\rm eff} - m)/m = (c_1 + c_2)/(1 - c_1),$


which reduces to $c_2$ when $c_1 = 0$. For this reason one also speaks
of $c_2$ as governing the ‘chromatic difference of magnification’.
These results are valid for all values of $\omega$, i.e.


$\epsilon_1' = \sum_{m=1}^{\infty} (\underset{m}{c}_1\,y' + \underset{m}{c}_2\,y_1')\,\omega^m.$ (103.3)
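
The statement about the out-of-focus plane can be checked with a few lines of arithmetic. The sketch below rests on assumptions that are not spelled out at this point of the text: the ray is treated as a straight line through the height y' in a reference plane at unit quasi-reduced distance before the ideal image plane, a positive shift is taken to lie beyond the image plane, and the coefficient values are hypothetical.

```python
def displacement(c1, c2, y_prime, y1):
    """First-order chromatic displacement (103.2)."""
    return c1 * y_prime + c2 * y1

def height_in_plane(shift, c1, c2, y_prime, y1):
    """Height at which a ray meets a plane `shift` beyond the ideal image plane.

    Assumption: the ray is a straight line through the height y_prime in the
    reference plane at unit (quasi-reduced) distance before the image plane
    and through the height y1 + displacement in the image plane itself.
    """
    y_image = y1 + displacement(c1, c2, y_prime, y1)
    return y_prime + (y_image - y_prime) * (1.0 + shift)

c1, c2 = 0.004, -0.001        # hypothetical values at some fixed colour
shift = c1 / (1.0 - c1)       # the out-of-focus plane discussed in the text
y1 = 0.5
for y_prime in (-0.2, 0.0, 0.3):
    print(y_prime, height_in_plane(shift, c1, c2, y_prime, y1))
print((1.0 + c2) / (1.0 - c1) * y1)
```

All three rays arrive at the same height, and that height equals $[(1 + c_2)/(1 - c_1)]\,y_1'$, so the ratio of the effective magnification to m is $(1 + c_2)/(1 - c_1)$.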


Now take $y_1' = 0$ for the moment. Then, when $\underset{1}{c}_1$ is arranged to be
zero, $\epsilon_1'$ will depend quadratically upon $\omega$ when $\omega$ is sufficiently
small. This means that, near $\omega = 0$, rays whose colours are given,
respectively, by the values $\omega$ and $-\omega$ of the chromatic coordinate
are brought to a common focus, $\epsilon_1'$ being stationary at $\omega = 0$. If
practical demands require $\epsilon_1'$ to be stationary at a particular
wavelength, then one may either take this as the base-colour, or else
design K in such a way that $\underset{1}{c}_1$ takes a certain very small value,
depending upon that of $\underset{2}{c}_1$. Under these circumstances one then says
that K is achromatic. In more general contexts the traditional
terminology is rather vague, a question to which we shall return in the
next section. At any rate, when this kind of achromatism obtains,
the residual chromatic displacement is often referred to as the
‘secondary spectrum’. (This is an ugly terminology, and we shall
avoid it.) A much better state of (axial, paraxial) correction is
obtained by designing K in such a way that $\underset{1}{c}_1$ and $\underset{2}{c}_1$ have values
relative to each other and to $\underset{3}{c}_1$ such that the displacement is
stationary at two points within the range of wavelengths of interest.


With regard to the transverse aberration coefficients c, one has, 
m 


in principle, the same situation, but in practice one is usually con- 
tent with reducing c, to a sufficiently small value. 
1 
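The behaviour just described can be illustrated numerically. The following is only a sketch under stated assumptions: the paraxial axial displacement is taken to be a power series in ω with coefficients supplied as a plain list, the field term is dropped ($y_1 = 0$), and the function names and sample values are hypothetical.

```python
def axial_displacement(c1_orders, omega, y_prime=1.0):
    """Paraxial axial displacement y' * sum_m c1[m] * omega**m, m = 1, 2, ...

    c1_orders[0] is the primary (m = 1) coefficient, c1_orders[1] the
    secondary (m = 2) coefficient, and so on. The field term is omitted
    (y_1 = 0), as in the text.
    """
    return y_prime * sum(c * omega ** (m + 1) for m, c in enumerate(c1_orders))


def stationary_colour(c1_primary, c1_secondary):
    """Chromatic coordinate at which the displacement is stationary,
    keeping only the terms in omega and omega**2."""
    return -c1_primary / (2.0 * c1_secondary)


# Hypothetical coefficients: primary term removed, secondary term left.
coeffs = [0.0, 0.02]
minus = axial_displacement(coeffs, -0.05)
plus = axial_displacement(coeffs, +0.05)
```

With the primary coefficient zero, the colours ω and −ω give the same displacement. A small non-zero primary coefficient moves the stationary colour away from the base-colour, which is the second design option mentioned above.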


104. The meaning of achromatism and apochromatism 


We have already remarked on the very restricted connotation
of ‘achromatism’, i.e. a system is achromatic if the longitudinal
paraxial displacement is stationary at one point of the range of
wavelengths of interest. In other words, the condition of
achromatism in effect places a limitation on only one of the aberration
coefficients, namely $\underset{1}{c}_1$. This may be good enough in
some cases, but in the case of systems of large aperture, for example,
this implicit over-emphasis on paraxial chromatism is out of place. It is
useless to devote endless effort to arranging the displacement of
axial rays to be stationary at two, or even three, points of the range
(i.e. ‘corrected for three, or four, colours’) if one ends up with a
system in which the spherical aberration varies appreciably with ω,
or as one sometimes says, a large amount of sphero-chromatism
remains. It is, indeed, pointless not to consider these types of
aberrations jointly in the course of design. Furthermore, systems
corrected for three colours in the restricted sense above have
sometimes mistakenly been called apochromatic, whilst those corrected
for four colours are referred to as superachromatic. This state of
affairs makes it desirable to develop a more detailed terminology.
Accordingly we call K simply achromatic with respect to a given
type of aberration if the latter is stationary at an appropriate point
of the range of wavelengths being contemplated. Similarly, if it is
stationary at two, three, ..., points we call it doubly, triply, ..., 
achromatic (with respect to the type in question). In the case of 
simple achromatism one may leave the qualification ‘simple’ under- 
stood whenever there is no risk of confusion. Further, one may 
speak of a given property of K as being achromatic (or achroma- 
tized) rather than of K as being achromatic with respect to this 
property. Thus we may say either that K is achromatic with respect 
to spherical aberration, or else that spherical aberration is achro- 
matic. 
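The terminology just introduced amounts to counting the stationary points of an aberration, regarded as a function of ω, within the range of colours contemplated. A minimal sketch follows, assuming the aberration is given as a polynomial in ω and counting sign changes of its derivative on a fine grid; the helper names and the sample polynomials are hypothetical.

```python
def derivative(coeffs):
    """Coefficients of d/d(omega) of sum_m coeffs[m] * omega**m."""
    return [m * c for m, c in enumerate(coeffs)][1:]


def evaluate(coeffs, omega):
    """Horner evaluation of a polynomial in omega."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * omega + c
    return result


def count_stationary_points(coeffs, lo, hi, samples=2001):
    """Count sign changes of the derivative on a uniform grid over [lo, hi]."""
    d = derivative(coeffs)
    count = 0
    prev = evaluate(d, lo)
    for i in range(1, samples):
        omega = lo + (hi - lo) * i / (samples - 1)
        cur = evaluate(d, omega)
        if prev == 0.0 or prev * cur < 0.0:
            count += 1
        prev = cur
    return count


NAMES = {0: "not achromatic", 1: "simply achromatic",
         2: "doubly achromatic", 3: "triply achromatic"}

# Hypothetical aberration omega**3 - 0.01*omega: two stationary colours.
doubly = count_stationary_points([0.0, -0.01, 0.0, 1.0], -0.1, 0.1)
# Hypothetical aberration omega**3 + 0.01*omega: none.
never = count_stationary_points([0.0, 0.01, 0.0, 1.0], -0.1, 0.1)
```

A grid count of this kind can miss closely spaced pairs of stationary points, so it is a rough classifier only.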

In the example just cited we have spoken simply of ‘spherical 
aberration.’ Now it is clear that, save in exceptional cases, if 
spherical aberration is achromatic it cannot be so for all zones 
simultaneously. The statement above is therefore ambiguous 
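unless a particular zone be specified. Explicitly, recall that the
displacement due to spherical aberration has the generic form

$$\epsilon' = (\underset{0}{p}^*_1 + \underset{1}{p}^*_1\omega + \underset{2}{p}^*_1\omega^2 + \ldots)\rho^3 + (\underset{0}{s}^*_1 + \underset{1}{s}^*_1\omega + \ldots)\rho^5 + (\underset{0}{t}^*_1 + \underset{1}{t}^*_1\omega + \ldots)\rho^7 + \ldots \equiv \sum_{n=2}^{\infty}\sum_{m=0}^{\infty} \underset{m}{u}^{(n)}\omega^m\rho^{2n-1}. \qquad (104.1)$$

If $\epsilon'$ is to be stationary at ω = 0 (say) for all values of ρ one
has to have

$$\sum_{n=2}^{\infty} \underset{1}{u}^{(n)}\rho^{2n-1} = 0, \qquad (104.2)$$

which requires that $\underset{1}{u}^{(n)} = 0$ (all n ≥ 2); and this is
asking too much in practice. However, the situation is not as serious as
it looks, since, on account of the way in which third-order spherical
aberration is usually balanced against the higher orders, it will often
suffice to require that $\underset{1}{u}^{(n)} = 0$ only for n = 2 and 3.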


The classical definition of apochromatism amounts to the con- 
dition that the (paraxial) longitudinal aberration be doubly achro- 
matic and that both spherical aberration and circular coma be 
simply achromatic. Nothing is said about the (paraxial) transverse 
aberration. Moreover, with regard to the non-paraxial aberrations, 
one still has to say something about what particular zone one has
in mind, or else interpret their ‘absence’ as meaning for instance
$\underset{1}{p}^*_1 = \underset{1}{p}^*_2 = 0$ and
$\underset{1}{s}^*_1 = \underset{1}{s}^*_2 = 0$, as already discussed;
the inclusion of chromatic conditions of coordinate order greater than
three depending upon the monochromatic state of correction. Terms
such as superachromatism and superapochromatism are also used
occasionally, the former meaning that $c_1$ is triply achromatic, the
latter apochromatism but with $c_1$ triply, instead of doubly,
achromatic.


105. The third- and higher-order aberrations 


A factor ν' was included in the right-hand member of (101.3) merely
as a matter of convenience. The extension of this formal device
to all orders clearly suggests itself. Accordingly we define a
chromatically modified aberration function, also denoted by v, by the
relation

$$V = V_0 + \nu' v. \qquad (105.1)$$


The displacement is then given as usual by (17.1). Owing to the 
presence of first-order aberrations, the third-order coefficients are 
no longer given by (23.2), but contain the coefficients c, and cp 
as well. One proceeds essentially as in Section 23, and, since the 
work is rather tedious but straightforward, it suffices to quote the 
result: 

PE = 401+ (34— 3c + 4), 

pk = 2p¥ = 2pyt+(—2¢, + 6g +c} —2€, Cg 4 CI CQ), 

PE = 2Pgt (cy — 2g + 2C1Cg— 2+ C43); (105.2) 

DE = Ay t (C4 — 2Cg + 2CyCg— CF +68), 

PS = Pst 3(3Co+ 34 +3). 


These hold for all values of w, so that the corresponding equations 
for the coefficients of the various orders may be read off from them. 
For example, 
PT =4pithe, pT = 4p1.+3(— 44); (105.3) 
1 1 1 2 2 2 1 
and so on. 
Under conditions in which one has to worry about chromatic defects
outside the paraxial region, one will surely have achromatized
the paraxial coefficients, i.e. $c_1 \approx 0$, $c_2 \approx 0$. Then
(105.2) become equations for the primary chromatic third-order
coefficients which are effectively identical with (23.2). A similar
remark applies to the fifth-order coefficients: one will not need to
know them unless $\underset{1}{c}_1$ and $\underset{1}{c}_2$ are
(approximately) zero. It follows that the primary chromatic fifth-order
coefficients are given by equations (23.3), if one imagines a subscript
1 to be placed under every coefficient which occurs in them; and one
will hardly ever need to consider higher chromatic orders.

In practice one must not forget that the coefficients which occur
in this section relate to the modified aberration function. If, in the
absence of this modification, one distinguishes the corresponding
coefficients by placing dots over them, the only change required in
the equations for the primary chromatic third-order coefficients,
for instance, is the replacement

$$\underset{1}{p}_\mu \to \underset{1}{\dot p}_\mu - \nu'_1\,\underset{0}{p}_\mu. \qquad (105.4)$$

In any event, ν' is very often unity in practice.
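The replacement (105.4) can be checked with a few lines of series arithmetic. The sketch below assumes, in line with the modification described in this section, that the unmodified (dotted) coefficient series equals ν' times the modified one, with ν' = 1 + ν'_1 ω + ...; the function name and the numerical values are hypothetical.

```python
def divide_series(numerator, denominator, order):
    """Power-series quotient numerator/denominator up to omega**order.

    Both arguments are lists whose index is the power of omega;
    denominator[0] must be non-zero.
    """
    quotient = []
    for m in range(order + 1):
        acc = numerator[m] if m < len(numerator) else 0.0
        for k in range(1, m + 1):
            if k < len(denominator):
                acc -= denominator[k] * quotient[m - k]
        quotient.append(acc / denominator[0])
    return quotient


# Hypothetical data: dotted coefficient series and nu' series in omega.
p_dot = [0.30, -0.12]      # chromatic orders 0 and 1 of the dotted coefficient
nu_prime = [1.0, 0.05]     # nu' = 1 + nu'_1 * omega, with nu'_1 = 0.05

p_modified = divide_series(p_dot, nu_prime, order=1)
expected_primary = p_dot[1] - nu_prime[1] * p_dot[0]
```

The primary chromatic term of the quotient agrees with the right-hand side of (105.4), while the monochromatic term is unchanged.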


106. Stop shifts 


To investigate the effects of stop shifts we go over to the angle
characteristic. Our first task to this end is to inquire into the generic
form of T, and of that of $T_0$ in particular. Since the aberrations are
to vanish independently of the value of ω for the pair of conjugate
planes characterized by the magnification m (which relates to the
base-colour) we must have

mart oa! = ° (106.1) 
for all rays, where $\kappa = \nu/\nu'$. (106.1) implies that $T_0$ must
be a function of
$\zeta = (\bar m\beta' - \beta)^2 + (\bar m\gamma' - \gamma)^2$ alone.
Here $\bar m = m/\kappa$, and though this is formally a (properly)
reduced magnification it is defined in terms of m, that is to say, we are
not concerned with planes which are conjugate for the colour ω, rather
than the base-colour. We naturally adopt the rotational invariants
ξ, η, ζ as given by (15.11), but with m, s replaced by $\bar m$ and
$\bar s$ (= s/κ) respectively, s itself also retaining its previous
significance.

Paraxially $T_0$ is proportional to ζ, and we take the factor of
proportionality to be ν/2m, for this entails that


y =—-7mp’. (106.2) 


An infinite object distance can therefore be accommodated by let- 
ting m tend to zero. Any other choice of the factor of proportionality 
creates difficulties in the context of this limiting process, and these 
we want to avoid. When aberrations are present, we write 


T = g(v) + (C/am+t), (106.3) 


having absorbed all terms of $T_0$ which depend non-linearly upon ζ
into the aberration function t. Note that the latter is defined by

$$T - T_0 = \nu t, \qquad (106.4)$$

i.e. a factor ν, rather than ν', is included on the right. This has the
consequence that the displacement 


e’ = (s—m) (20, + wt,) (106.5) 


has a factor s − m which is independent of ω. Here σ and μ are
defined as in (24.2), but with $\bar s$, $\bar m$ replacing s and m respectively.
Like (24.4), equation (106.5) is only superficially simple, for we 
have all the complications which we met in Section 24. Indeed, 
if higher-order chromatic aberrations are to be taken into account, 
the situation is worse, for one has to go through the kind of work 
following equation (24.5) even in the paraxial region. (See, however, 
the alternative procedure outlined at the end of this section.) 

To see what is involved let us therefore consider only paraxial 
imagery for the time being. We write 


$$t = \tfrac{1}{2}c_1\xi + c_2\eta + \tfrac{1}{2}c_3\zeta, \qquad (106.6)$$

where, we note, the term $\tfrac{1}{2}c_3\zeta$, which depends upon ζ
alone, has to be included, just as the term $p_6\zeta^2$ had to be
retained in equation (24.1). Then
Yo = (K+sc,+ my) 6+ (1 —K+ SCy+ mC) p, 
(106.7) 
Vz = MC, + Cg) 6+[1 +m(Co+¢s)] B 
When m = o these reduce to 


y. = (K+5se,)o+(1—K+5c,)8, y, = 8B. (106.8) 
The paraxial displacement is, in view of (106.5), given by
ef = (s—m) (c,6 +c), (106.9) 


the primary chromatic part of which is 


ei = (s—m) (6+ coe) = (s—m) (C:¥e+¢2¥1); (106.10) 


since, according to (106.7), y, = ¢+O(14, 30)) Yr = &+ O(14; 39)- 
These equations may also be used in the primary chromatic parts 
of the right-hand members of (106.7); cf. the step following equa- 
tion (24.8). This suffices to obtain the secondary chromatic first- 
order displacement 


€ = (s—m) {[c, —(sef + 2mey cy + me + KCy)] Ye 
2 2 1 11 1 11 
+ [€g — (Se, Cy -+ mc, Cy + mc + mcyCz—KCy)] y,}. (106.11) 
2 11 11 1 11 11 


These results should be sufficient for most purposes. However,
in principle one can include the terms of all chromatic orders by
simply solving the pair of linear equations (106.8) for σ and μ, and
inserting the solution in (106.9).

When we go on to the non-paraxial aberrations the complica- 
tions are less tractable, though one can still cope with those of the 
first chromatic order without excessive labour, mainly because the 
Co are absent. The kind of result one gets may be illustrated by the 


coefficient of primary chromatic third-order spherical aberration: 
PT = (s—m) {4p, — 3[4(sc, + mcg + K) py + m(cy +c) Po]}. (106.12) 
1 1 1 1 10 1 1 0 


It should always be borne in mind that in the various equations 
above we have set f = 1 throughout. 

There is not much point in continuing to write down equations 
relating to the most general circumstances, for in practice one is 
not likely to have to deal with chromatic coefficients of higher 
order unless the system is at least paraxially highly achromatized 
($c_1 \approx c_2 \approx 0$). At any rate, the relations between the characteristic
coefficients 2%) and the effective coefficients have been dealt with 

™m 


at sufficient length to show what is involved.

We can now use exactly the method of Section 44 to study the
effects of stop shifts. We only need to observe that the position of the
stop enters into T only implicitly through the presence of s in the
relations between $\hat\xi, \hat\eta, \hat\zeta$ on the one hand and
ξ, η, ζ on the other. That $\bar s, \bar m$ now replace s, m has no
relevant consequences whatsoever, bearing in mind that κ will not occur
in g, which continues to
be given by (44.4). It follows that all equations such as (35.13), (44.8)
and (44.10) may be taken over as they stand. Since all of these are 
linear in the various coefficients, any such equation decomposes 
into a set of similar equations relating to chromatic orders 1, 2,.... 
It only remains to add the equations for coordinate order 1, and
they are

$$\hat c_1 = c_1, \qquad \hat c_2 = c_2 + q c_1, \qquad \hat c_3 = c_3 + 2q c_2 + q^2 c_1. \qquad (106.13)$$

Moreover, the investigations of Chapter 5D concerning invariant
and semi-invariant aberrations evidently extend to the chromatic
coefficients without effective modification.

In view of the complexity of the relations between the characteristic
and effective coefficients it is well worth while to spend a
little time on considering the effects of stop shifts on the
displacement by a direct method, i.e. one which does not explicitly
involve the characteristic coefficients at all. Taking (14.27) and
(35.10) into account, a quite elementary geometrical argument yields the
relation

$$y' = \hat y' + q(y_1 + \hat\epsilon'), \qquad (106.14)$$


where y’ now stands for the quantity hitherto denoted by y,. 

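Recall that we can here write $\epsilon'$ or $\hat\epsilon'$ for the
displacement, whichever is the more convenient, and in (106.14) we made
the second choice; cf. the discussion following equation (35.11), as well
as that leading to equation (41.3). Now, paraxially, writing
$\hat c^*_1$, $\hat c^*_2$ for the effective coefficients,

$$\hat\epsilon' = \hat c^*_1\hat y' + \hat c^*_2 y_1 = \epsilon' = c^*_1[\hat y' + q(y_1 + \hat c^*_1\hat y' + \hat c^*_2 y_1)] + c^*_2 y_1, \qquad (106.15)$$

where (106.14) has been used on the right. It follows at once that

$$\hat c^*_1 = \frac{c^*_1}{1 - q c^*_1}, \qquad \hat c^*_2 = \frac{c^*_2 + q c^*_1}{1 - q c^*_1}. \qquad (106.16)$$

The primary and secondary chromatic relations therefore are

$$\underset{1}{\hat c}^*_1 = \underset{1}{c}^*_1, \qquad \underset{1}{\hat c}^*_2 = \underset{1}{c}^*_2 + q\,\underset{1}{c}^*_1, \qquad (106.17)$$

$$\underset{2}{\hat c}^*_1 = \underset{2}{c}^*_1 + q\,(\underset{1}{c}^*_1)^2, \qquad \underset{2}{\hat c}^*_2 = \underset{2}{c}^*_2 + q\,\underset{2}{c}^*_1 + q\,\underset{1}{c}^*_1\,\underset{1}{c}^*_2 + q^2(\underset{1}{c}^*_1)^2. \qquad (106.18)$$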


As a matter of fact, to obtain even the expression for ¢¥ just given, 
2 


starting from cf as it appears in (106.11) as the factor multiplying 
2 


y. there, is quite a laborious task; compared with which the work 
involved in the method just used is almost negligible. 
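A short numerical check of the direct method may be useful here; it is offered only as a sketch. It assumes that the shifted effective coefficients have the closed forms $c_1/(1 - qc_1)$ and $(c_2 + qc_1)/(1 - qc_1)$, as read from (106.16), expands them as power series in ω, and compares chromatic orders 1 and 2 with the explicit expressions. The function names and the sample values of the coefficients and of q are hypothetical.

```python
def divide_series(numerator, denominator, order):
    """Power-series quotient numerator/denominator up to omega**order."""
    quotient = []
    for m in range(order + 1):
        acc = numerator[m] if m < len(numerator) else 0.0
        for k in range(1, m + 1):
            if k < len(denominator):
                acc -= denominator[k] * quotient[m - k]
        quotient.append(acc / denominator[0])
    return quotient


def shifted_effective(c1, c2, q, order=2):
    """Series of c1/(1 - q*c1) and (c2 + q*c1)/(1 - q*c1) in omega.

    c1 and c2 are lists indexed by chromatic order; index 0 is zero
    because the paraxial chromatic coefficients vanish at the base-colour.
    """
    denominator = [1.0 - q * c1[0]] + [-q * c for c in c1[1:]]
    numerator2 = [b + q * a for a, b in zip(c1, c2)]
    return (divide_series(c1, denominator, order),
            divide_series(numerator2, denominator, order))


# Hypothetical effective coefficients (orders 0, 1, 2) and stop-shift parameter.
c1 = [0.0, 0.02, -0.01]
c2 = [0.0, 0.005, 0.003]
q = 0.4

c1_hat, c2_hat = shifted_effective(c1, c2, q)

primary = (c1[1], c2[1] + q * c1[1])
secondary = (c1[2] + q * c1[1] ** 2,
             c2[2] + q * c1[2] + q * c1[1] * c2[1] + q ** 2 * c1[1] ** 2)
```

The series coefficients and the explicit primary and secondary expressions agree to rounding error, which is the sense in which the direct method requires almost no work.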

As already remarked, it will generally be good enough to consider
the non-paraxial coefficients only when $c_1$ and $c_2$ can be taken as
effectively zero; and then the effects of stop shifts on the primary
chromatic third-order coefficients are just as simple as those on the
monochromatic third-order coefficients.


107. Offence against the sine-condition 


It is natural to inquire how one can incorporate chromatic effects 
into our previous investigations of total circular coma in Chapter 
5A. Fortunately it turns out that the ‘monochromatic equations’ 
are of a kind almost ideally suited to the present context. To begin 
with, we take the aberration function v to be modified in the sense 
implied by equation (105.1). Further, equation (100.5) is unnecessarily
clumsy: it is much better here to incorporate v in the remaining terms
on the right, so that we again have (26.1), but now

$$S(\xi) = \sum_{n=0}^{\infty} s_n\xi^{n+1}, \qquad C(\xi) = \sum_{n=0}^{\infty} \bar c_n\xi^n, \qquad (107.1)$$

i.e. the summations extend from n = 0 rather than n = 1, as they
did in (26.2). Also, we have written $\bar c_n$ in place of $c_n$, to
avoid confusion with the constants which occur in (103.1). Thus

$$s_0 = \tfrac{1}{2}c_1, \qquad \bar c_0 = c_2, \qquad (107.2)$$

and the $s_n$, $\bar c_n$ of course all depend upon ω. This
incorporation of the paraxial terms in the spherical and circular
comatic aberration functions is quite consistent with our general
classification of Section 22, for the types of aberrations governed by
$c_1$ and $c_2$ clearly have the required character. In particular,
$c_2$ is a comatic coefficient, for the zonal curve, which is in fact a
single point, lies asymmetrically with respect to the ideal image point.

Granted, then, that every displacement under discussion is
understood to include the part induced by v, everything remains
essentially as before. The only formal change is the replacement of
β throughout by κβ, which entails that in relations such as (31.5) m
must be replaced by m/κ. The central result of Section 31 then
retains its previous significance, i.e. Kh' is the exact total sagittal
circular comatic displacement for whatever colour ω to which the
various quantities in (31.5) relate. This conclusion is not entirely
trivial. It is true that we could use (31.5) as it stands for any colour
ω we please, but in so doing we would be considering the displacement
in the variable ideal image plane corresponding to ω rather
than in a fixed image plane.

For the sake of illustration suppose that, with an appropriate
choice of $\underset{0}{\lambda}$, K is so designed that paraxial rays
of the colour $\lambda_1$ unite in J'. This means that one has complete
paraxial simple achromatism: $c_1(\omega_1) = c_2(\omega_1) = 0$. At
these colours $\underset{0}{\lambda}$ and $\lambda_1$ the values of
the comatic displacements then differ from each other solely on account
of the variation with ω of the third-, fifth-, ... order terms of S and C;
but at intermediate colours the effects of residual (paraxial)
chromatism are correctly taken into account. Again, from a slightly
different point of view, one has quite generally


K = K+wi(C—2£RPPS)o+..., (107.3) 
0 o1 0001 


and one will achieve a high state of chromatic correction of the 
circular comatic asymmetry to the extent that one can cause the 
factor multiplying w to vanish. 


108. Reversibility and concentricity 


A moment’s reflection shows that the angle characteristic of a
reversible symmetric system K must satisfy the identity (54.2) for
all colours, bearing in mind that κ = 1 here. Paraxially one has


T —d¥a = he,£+c,9 + $e,6 + 4d*6. 
Using (15.11) the equality of the coefficients of ξ and ζ immediately
gives the relation

$$(1 - s^2)c_1 + 2(1 - sm)c_2 + (1 - m^2)c_3 = 0. \qquad (108.1)$$

For general values of s and m this is not very interesting on account
of the presence of the coefficient $c_3$, which does not enter into the
paraxial pseudo-displacement. Only when $m^2 = 1$ does one get a
useful result, namely when m = ±1:

$$(1 \pm s)c_1 + 2c_2 = 0. \qquad (108.2)$$

In particular, when the system is completely reversible the
(characteristic) coefficient of transverse chromatic aberration vanishes
for all values of ω:

$$\underset{m}{c}_2 = 0 \quad (m = 1, 2, \ldots). \qquad (108.3)$$

The exact paraxial displacement is then


€, = 20,(1 +6, —Cg—2CyCg)*[(1 — Cs) Yet gyi]. (108.4) 

With regard to the non-paraxial coefficients, reversibility imposes 
precisely all the limitations described in Chapter 6A. (56.4) and 
(59.4), in particular, are the relations between the coefficients of 
coordinate order three, and these split up into the corresponding 
relations for every chromatic order, taken separately. The precise 
limitations imposed upon, say, the primary chromatic third-order 
effective coefficients are, of course, of a rather complicated kind. 
However, this is a somewhat esoteric problem, since in practice
one will be confronted with more specialized situations in which
certain coefficients, such as $\underset{1}{c}_1$ or
$\underset{1}{c}_3$, are already known to be of negligible magnitude;
and then the work becomes much more tractable.

We go on to consider the concentric system. It is obvious that
the argument which led to the appropriate generic form (65.11) of
T can be taken over into the present context effectively unaltered.
Of course, the undetermined function of χ will in general depend
now also upon ω, and we write it as νG(χ, ω), the factor ν being
inserted once again merely for convenience. Then


'T = v[G(x, o) +9'«1 (1 —£)8 + 9(1 — $8]. (108.5) 


One evidently has the usual paucity of independent aberration
coefficients in every (coordinate and chromatic) order: there is only
one, namely $\underset{m}{G}_n$. As an elementary consequence of this
(taking the case ν = ν' only), if K produces an ideal image of an axial
object point irrespectively of colour, then all chromatic defects of
whatever kind will be absent.

From (106.4), (106.6), (65.12) and (65.14) we have, paraxially, 


C$ + 20,9 +(cgtm-) ¢ = G,(w) (——294+ 6) —q’k1€ — gf. (108.6) 
We can now express ¢,, ¢,, Cz in terms of G,. Thus, concisely, 
c¢, = (k—m)P 7, 
Cy = —(k—m)(k—s)T, 
Cz = (k—s)* T—(k—1)/(k—m), (108.7) 
where (s—m)?T = G,—(1 —m)/(k—m). (108.8) 


From these we can determine the exact effective paraxial coeffi- 
cients. In particular, if D is the discriminant of the pair of equations 
(106.7), it turns out that 


cf = Dk(k—m) (s—1)T. (108.9) 


Unlike the characteristic coefficient $c_2$, the effective coefficient
$c^*_2$ vanishes when s = 1. This is as it should be, for the general
result stated just after equation (67.5), namely, that when the stop is
central all barred effective coefficients vanish, applies also to the
chromatic coefficients of all types.


109. Remark on the so-called D-d method 


We saw in Section 37 that the spherical point characteristic
$V^\dagger$ had one great advantage over V, namely, that the
displacement, given by

ée = X'——; = oy (109.1) 


is simply the derivative of $v^\dagger$ to within a factor which, in all
but the most extreme cases, is virtually constant for a given object
point; reflecting the near-constancy of the optical distance between O
and points of W. It is therefore of advantage to use $V^\dagger$ now in
the context of chromatic aberrations, though the following discussion
would not alter in essence if it proceeded in terms of V. (109.1)
continues to apply, but $v^\dagger$ will now contain paraxial terms of
chromatic origin. (In order not to encumber the discussion
unnecessarily we are supposing that N = N' = 1.)

We may write

$$V^\dagger(\omega) = \sum_{m=0}^{\infty} \underset{m}{V}^\dagger\omega^m. \qquad (109.2)$$

For colours sufficiently close to $\underset{0}{\lambda}$ chromatic
defects are contained in the term linear in ω; so that in a small range
of wavelengths about ω = 0 we merely require to know the derivative

$$\underset{1}{V}^\dagger = (\partial V^\dagger/\partial\omega)_{\omega=0}. \qquad (109.3)$$

Should this happen to vanish for a certain range of values of the
ray-coordinates then K is simultaneously achromatized with respect
to all types of aberrations which sensibly contribute to the
monochromatic defects within the coordinate range in question.

Now let O, D’ be points in ¥ and W, respectively. (The base- 
surface Wo is exactly that of Section 37, and it is the same for all 
colours.) Further, let $\underset{0}{\mathcal{R}}$ and
$\mathcal{R}_\omega$ be two rays through O and D', the chromatic
coordinates having the values 0 and ω respectively. Then

$$V^\dagger(\omega) = \int N(\omega)\,ds(\omega), \qquad V^\dagger(0) = \int N(0)\,ds(0), \qquad (109.4)$$

where ds(ω) and ds(0) are elements of arc of $\mathcal{R}_\omega$ and
$\underset{0}{\mathcal{R}}$ respectively, and both integrals are
extended from O to D'. Hence

$$V^\dagger(\omega) - V^\dagger(0) = \int [N(\omega) - N(0)]\,ds(0) + \int N(\omega)\,ds(\omega) - \int N(\omega)\,ds(0). \qquad (109.5)$$

The last two terms on the right represent the difference between the
integral of N(ω) taken along the actual ray $\mathcal{R}_\omega$ joining
O and D' and the integral of the same function N(ω) along another curve
(i.e. $\underset{0}{\mathcal{R}}$) joining the same points. When ω is
sufficiently small the curve $\underset{0}{\mathcal{R}}$ is neighbouring
to the ray $\mathcal{R}_\omega$ and, in view of Fermat’s Principle
(2.8), the two terms in question, taken together, must vanish. At the
same time δN = N(ω) − N(0) becomes $\underset{0}{N}\nu_1\omega$, because
of (101.1), so that finally

$$\underset{1}{V}^\dagger = \int \underset{0}{N}\nu_1\,ds(0). \qquad (109.6)$$

In practice the most common situation is that the ray in the
course of its passage through K encounters a number of refracting
and reflecting surfaces. Some of its segments will be in air (to which
we ascribe the refractive index N = 1) and some within homogeneous
media (e.g. glasses). The length of the segment of
$\underset{0}{\mathcal{R}}$ in a particular such medium is traditionally
denoted by D, and then (109.6) may be written

$$\omega\,\underset{1}{V}^\dagger = \sum D\,\delta N, \qquad (109.7)$$

the sum being taken over the various ‘elements’ of K. Since it is
$v^\dagger$ rather than $V^\dagger$ which is relevant to the
displacement, we may subtract out the irrelevant part
$\underset{1}{g}^\dagger(\zeta)$ of $\underset{1}{V}^\dagger$. With the
convention that $v^\dagger(0, 0, \zeta)$ is to be understood as having
been absorbed in $g^\dagger(\zeta)$, one has

$$\omega\,\underset{1}{g}^\dagger(\zeta) = \sum d\,\delta N, \qquad (109.8)$$

where d denotes the length of a segment of the ray through E', ω
of course being zero for this ray. Then

$$\omega\,\underset{1}{v}^\dagger = \sum (D - d)\,\delta N. \qquad (109.9)$$

We denote the sum on the right by Q. Its use is very popular on
account of the ease with which it may be calculated; and its form
explains the reason for the name ‘(D − d)-methods’ for procedures
of chromatically correcting systems on the basis of information
gained from evaluating Q for a number of rays.
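The sum Q is simple enough to be stated as a few lines of code. The sketch below assumes that each element of the system is described by three numbers: the path length D of the ray in question within the element, the corresponding length d for the reference ray, and the index increment δN of the medium. The function name and the sample values are hypothetical and do not describe any actual design.

```python
def d_minus_d_sum(elements):
    """Return Q = sum over elements of (D - d) * delta_N.

    elements: iterable of (D, d, delta_N) triples, one per medium traversed.
    Segments in air contribute nothing, since delta_N is taken as zero there.
    """
    return sum((D - d) * delta_n for D, d, delta_n in elements)


# Hypothetical two-glass system with an air gap between the components.
system = [
    (5.6, 6.0, 0.008),   # first glass: marginal path shorter than reference path
    (1.5, 1.4, 0.0),     # air gap: no dispersion, no contribution
    (2.2, 2.0, 0.016),   # second glass: marginal path longer, higher dispersion
]

q_value = d_minus_d_sum(system)
```

In this made-up example the two glass contributions cancel, so Q is zero for the ray considered.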

In practice it has been customary to take for δN the actual value
of $N - \underset{0}{N}$ as given by glass tables, the fact that ω may
not be ‘small’ (less than 0.1, say) being disregarded. In so doing one
calculates a quantity which differs from

$$Q = \omega\sum (D - d)\,\underset{0}{N}\nu_1, \qquad (109.10)$$

by an amount whose dominant term is (at least) quadratic in ω.
This difference will, however, be very small, just because ω was
so defined as to lead to the rapid convergence of the series (101.1),
which certainly ensures that $|\nu_1| \gg |\nu_2\omega|$. Granted this,
the use of the exact values of δN implies the abandonment of consistent
definitions in the context of aberration coefficients, without any
corresponding gain with respect to exact inferences which may be
drawn from the calculated values of Q. The reason for this state of
affairs is that in any event we can only assert that

$$Q = \omega\,\underset{1}{v}^\dagger + O(\omega^2), \qquad (109.11)$$

on account of the last two terms on the right of (109.5) which were
rejected. These contribute an amount which itself varies with the
second and higher powers of ω: and that is all we know of it. In
short, we can, in general, draw no quantitative inferences about
$\underset{2}{v}^\dagger$ from a knowledge of Q alone.


It may be apposite to mention at this point that one occasionally 
comes across claims which seem to contradict what has just been 
said. These are usually supported by evidence which revolves 
about the results of computations relating to systems actually en- 
countered in practice. The reason why the latter fail to expose the 
fallacies involved is that, as the result of much empirical experience, 
systems are deliberately designed from the outset in such a way that 
rays of different colours, connecting the same pairs of points, will 
not differ widely from each other anywhere in the system. In other 
words, one aims, by prescription, at keeping down the value of the 
sum of the terms rejected in (109.5). 

A much-used application of the (D − d)-sum may be illustrated
by considering a single axial ray, which we may take to be meridional.
When ω = 0 the total displacement of this ray is
$\underset{0}{\epsilon}'(y')$. Another ray through O with the same
value of the coordinate y', but this time having the colour ω, will
have the same displacement as the first ray if

$$\dot V^\dagger(y', \omega) - \dot V^\dagger(y', 0) = 0,$$

where a dot denotes differentiation with respect to y'. Retaining
only the chromatically dominant term, i.e. that linear in ω,
achromatism therefore obtains at $y_z$, say, when

$$\dot Q(y_z) = 0. \qquad (109.12)$$

Suppose now that actual calculation shows Q to be zero for the ray
having $y' = y_a$, say. Then, since Q vanishes at y' = 0 and at
$y' = y_a$, its derivative must vanish somewhere between $y_a$ and 0,
i.e. there must exist a constant θ between 0 and 1 such that

$$\dot Q(\theta y_a) = 0. \qquad (109.13)$$

The vanishing of Q at $y_a$ therefore ensures that K is axially
achromatized for the intermediate zone $y_z = \theta y_a$; but, in
general, as regards the actual value of θ, nothing more can be said.
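The mean-value argument can be made concrete with a small numerical sketch. It assumes a purely hypothetical form of Q as a function of y', chosen only because it vanishes at 0 and at $y_a$; the zone at which the derivative vanishes is then located by a sign scan followed by bisection. None of the names below come from the text.

```python
import math


def q_sum(y, y_a=1.0):
    """Hypothetical (D - d)-sum along axial rays: zero at y = 0 and y = y_a."""
    return y * y * (y * y - y_a * y_a)


def derivative(f, y, h=1e-6):
    """Central-difference estimate of df/dy."""
    return (f(y + h) - f(y - h)) / (2.0 * h)


def achromatic_zone(f, y_a, samples=200):
    """Locate a zone in (0, y_a) where the derivative of f changes sign."""
    prev_y = y_a / samples
    prev_d = derivative(f, prev_y)
    for i in range(2, samples):
        y = y_a * i / samples
        d = derivative(f, y)
        if prev_d * d <= 0.0:
            lo, hi = prev_y, y
            for _ in range(60):          # plain bisection on the sign change
                mid = 0.5 * (lo + hi)
                if derivative(f, lo) * derivative(f, mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
        prev_y, prev_d = y, d
    return None


zone = achromatic_zone(q_sum, 1.0)
theta = zone / 1.0
```

For this particular choice of Q the zone lies at $y_a/\sqrt{2}$, but a different Q with the same two zeros would give a different θ, in keeping with the remark that nothing more can be said about its value in general.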

The specialization to axial rays is unnecessary; everything goes
through equally well when $y_1 \neq 0$. The same is, however, not true
for a completely general ray, which is, perhaps, not very surprising.
The vanishing of the primary chromatic displacement now requires

$$Q_y(y_z, z_z) = 0, \qquad Q_z(y_z, z_z) = 0, \qquad (109.14)$$

and these replace the single condition (109.12). (The fixed
coordinates $y_1$ have been left understood.) On the other hand, if
$Q(y_a, z_a) = 0$ we can only infer that there exists a constant θ
between 0 and 1 such that

$$y_a Q_y(\theta y_a, \theta z_a) + z_a Q_z(\theta y_a, \theta z_a) = 0. \qquad (109.15)$$

From this (109.14) does not follow, so that in principle no property
of achromatism can be inferred from the vanishing of Q, calculated
for a single skew ray.
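A simple counter-example shows why the single combination of derivatives can vanish without either partial derivative doing so. The sketch assumes a hypothetical Q depending on the two ray coordinates y and z; the form chosen has no physical significance and serves only to exhibit the logic.

```python
def q_sum(y, z):
    """Hypothetical (D - d)-sum for a skew ray; not taken from any real system."""
    return y * y - z * z


def gradient(y, z):
    """Analytic partial derivatives (Q_y, Q_z) of q_sum."""
    return 2.0 * y, -2.0 * z


def directional_condition(y_a, z_a, theta):
    """Left-hand side y_a*Q_y + z_a*Q_z evaluated at (theta*y_a, theta*z_a)."""
    q_y, q_z = gradient(theta * y_a, theta * z_a)
    return y_a * q_y + z_a * q_z


y_a, z_a = 1.0, 1.0
thetas = [0.25, 0.5, 0.75]
directional = [directional_condition(y_a, z_a, t) for t in thetas]
partials = [gradient(t * y_a, t * z_a) for t in thetas]
```

Here Q vanishes for the ray with coordinates (1, 1) and the combined condition is satisfied at every intermediate point, yet neither partial derivative vanishes anywhere along the way.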


Problems 


P.10(i). Derive an expression for the exact paraxial displacement 
e:(yzly1) when the object is at infinity, given equations (106.79). 


P.10(ii). Show that the primary chromatic central angle
characteristic of a single spherical refracting surface of unit radius
can be written in the form


T = N'vi[(N'—1) +x TA, 

1 0 0 0 
(N has been taken to be unity, and the qualification ‘central’ means 
that both base-points are taken at the centre of the sphere.) 


P.10(iii). Given the definition (101.4) of the chromatic coordinate,
show that one can choose a new base-colour $\underset{0}{\lambda}^*$ and
at the same time a new constant «* in place of 3 in the denominator in
such a way that the new coordinate ω* is simply a linear function of the
old.


P.10(iv). The monochromatic offence against the sine-condition 
of a certain system has been removed. Show that to a sufficient 
degree of approximation it will have been removed also for neigh- 
bouring colours if the condition 


C = 2&8 
1 1 


is satisfied. 


CHAPTER 11


ANISOTROPY 


110. Fermat’s Principle and the propagation of light in 
anisotropic media 


Hitherto we have always supposed the media in the object and 
image spaces to be isotropic, i.e. the refractive indices N and N’ at 
points A and A’ of these spaces were taken to be independent of the 
directions e and e’ of the ray through A and A’. On a number of 
occasions it was pointed out, however, that when one discusses the 
imagery of a given system K merely upon the basis of a characteristic 
function whose generic form reflects the over-all symmetries of K, 
the possibility of K being internally anisotropic is by no means 
excluded. In other words, at points within K, N may very well be
a function of the directions of the rays through them; so that it is
then a function of the six variables x, y, z, α, β, γ, the last three of
which are connected by the identity $\alpha^2 + \beta^2 + \gamma^2 = 1$.

What has just been said can be meaningful only if a characteristic 
function can still be defined when light is propagated through 
anisotropic media. This being so it is incumbent upon us to inquire 
into the actual state of affairs. This chapter is, on the whole, directed 
mainly towards this end; for which reason it will deal with little 
more than general principles. It continues for the time being the 
simple, phenomenological kind of argument which was pursued 
in the early sections of this book; which is, after all, in keeping with 
the simplified picture of the propagation of light which geometrical 
optics presents. 

The essentially new ingredient which has to be added to the basic
picture presented in Chapter 1 is this: exceptional circumstances
apart, the light (now supposed monochromatic) traversing a given
system consists of two distinct parts, which propagate independently
of each other. We arbitrarily distinguish the two kinds of rays and
all quantities associated with them by the subscripts A and B. In
particular, we have two kinds of rays $\mathcal{R}_A$ and
$\mathcal{R}_B$, which have their usual significance. For instance, the
tangent to $\mathcal{R}_A$ at P gives the direction in which A-light
(i.e. radiant energy of the A-kind) is being transported at P; with an
entirely analogous meaning of $\mathcal{R}_B$.

Now, as we know, if the propagation of light is governed by a
variational principle, the existence of a point characteristic is
immediately guaranteed. We are therefore led to ask whether, or
in what form, Fermat’s Principle survives in anisotropic media.
The answer to this is quite simple: each kind of ray is governed by an
extremal principle exactly of the form (2.8), granted that the optical
medium as such is, in general, now defined by two distinct refractive
index functions $N_A(x, \ldots, \gamma)$ and $N_B(x, \ldots, \gamma)$.
(See also the end of Section 113.) Then the ray $\mathcal{R}_A$ joining
two points A and A' is that curve for which

$$V_A = \int_A^{A'} N_A\,ds_A$$

is stationary in the sense explained at length in Section 2, whilst
for $\mathcal{R}_B$

$$V_B = \int_A^{A'} N_B\,ds_B$$

is likewise stationary. (In special cases, e.g. in an electron-optical
instrument with magnetic focusing, there may be only one kind
of ray.) The optical distance between two points is not a function of
these points alone, but depends also upon the kind of light with
respect to which it is defined; but this is nothing new, for we
encountered an analogous situation already in the context of
chromatic effects.

Since the two optical distances $V_A(A, A')$ and $V_B(A, A')$ which
are now in hand depend only on the coordinates of $A$ and $A'$,
we recognize that there are now two point characteristics,

$$V_A(x', y', z', x, y, z) \quad\text{and}\quad V_B(x', y', z', x, y, z)$$


associated with K, one for each kind of light; and they jointly charac- 
terize completely the purely geometrical-optical behaviour of K. 
Evidently when we come to derive the fundamental equations cor- 
responding to (3.5) we can simplify the notation by suppressing the 
indices A and B on most occasions. They are easily restored by 
inspection, should this be required. At any rate, we now see that all 
our previous work remains relevant even when K is internally
anisotropic, a state of affairs we had hitherto taken for granted:
we merely need to understand the characteristic function in every 
case to stand alternatively for that appropriate to A-light and B- 
light. The basic equations (3.5), it should be recalled, have been used 
only in isotropic regions. 

It will be noted that we have here adopted a course different from 
that followed in Section 2, in as far as we have now simply postulated 
the validity of Fermat’s Principle, without first contemplating the 
refraction of light at the boundary between two homogeneous 
media. The reason for this is that the detailed, explicit rules which 
describe such refraction are very cumbersome: it seems better to 
consider the problem after accepting Fermat’s Principle, and this 
we do very briefly in the next section. 


111. Refraction at a boundary between homogeneous 
media 

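To deal with the principles of the refraction of light at the boundary
of two homogeneous media, one or other or both of which may be
anisotropic, we need to consider the state of affairs to which Fig. 2.1
relates. The optical length $V^*(APA')$ is still given by (2.4), but
upon varying P we must now write

$$\delta V^* = \delta(N's') + \delta(Ns) = N'\,\delta s' + N\,\delta s + s'\,\delta N' + s\,\delta N, \tag{111.1}$$

since, in general, $\delta N$ and $\delta N'$ will differ from zero on account of the
fact that the directions of the lines $AP$, $PA'$ differ from those of
$AP_1$, $P_1A'$ respectively, where $P_1$ is the displaced point of incidence.
If we define a vector $\mathbf{n}$ as

$$\mathbf{n} = \left(\frac{\partial N}{\partial\alpha}, \frac{\partial N}{\partial\beta}, \frac{\partial N}{\partial\gamma}\right), \tag{111.2}$$

(111.1) becomes

$$\delta V^* = N'\mathbf{e}'\cdot\delta\mathbf{s}' + N\mathbf{e}\cdot\delta\mathbf{s} + s'\mathbf{n}'\cdot\delta\mathbf{e}' + s\,\mathbf{n}\cdot\delta\mathbf{e}. \tag{111.3}$$

Consider merely the second and fourth terms on the right. We have
$s\,\delta\mathbf{e} = \delta\mathbf{s} - \mathbf{e}\,\delta s = \delta\mathbf{s} - \mathbf{e}(\mathbf{e}\cdot\delta\mathbf{s})$. The two terms in question become

$$N\mathbf{e}\cdot\delta\mathbf{s} + s\,\mathbf{n}\cdot\delta\mathbf{e} = [(N - \mathbf{n}\cdot\mathbf{e})\mathbf{e} + \mathbf{n}]\cdot\delta\mathbf{s}. \tag{111.4}$$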


At this stage we recall that the variables $\alpha, \beta, \gamma$ are not mutually
independent. Indeed, with the aid of the identity $\alpha^2 + \beta^2 + \gamma^2 = 1$
which holds between them, we can always arrange $N$ to become a
homogeneous function of any desired degree $q$ in $\alpha, \beta, \gamma$. The choice
$q = 1$ entails (by Euler’s theorem on homogeneous functions) just that

$$\mathbf{e}\cdot\mathbf{n} = N, \tag{111.5}$$

which is a convenient result. In future we therefore understand
that all refractive indices are to be written as homogeneous functions of
degree 1 in $\alpha, \beta, \gamma$. (111.3) then becomes

$$\delta V^* = \mathbf{n}'\cdot\delta\mathbf{s}' - \mathbf{n}\cdot\delta\mathbf{s}. \tag{111.6}$$


Proceeding now as in Section 2 we obtain the laws of refraction in
the form

$$\mathbf{n}' - \mathbf{n} = \sigma\mathbf{p}, \tag{111.7}$$

where $\sigma$ is some scalar factor, since, in view of Fermat’s Principle,
$\delta V^*$ must vanish for all small displacements of P which leave this
point in the boundary.
For an isotropic medium of constant refractive index $N_0$, say, we
have to take

$$N = N_0(\alpha^2 + \beta^2 + \gamma^2)^{1/2}, \tag{111.8}$$

and then

$$\mathbf{n} = N_0\,\mathbf{e}\,(\alpha^2 + \beta^2 + \gamma^2)^{-1/2} = N_0\,\mathbf{e}, \tag{111.9}$$


and with this (111.7) reduces to (2.6). The simplest formal anisotropic
counterpart of (111.8) is the case in which $N^2$ is a quadratic
form in the components of $\mathbf{e}$. (By ‘formal’ we mean that we need
not necessarily be able to find a physical realization of such a
refractive index.) Writing $e_1, e_2, e_3$ in place of $\alpha, \beta, \gamma$ whenever
convenient, we have

$$N = \Bigl(\sum_{k,l=1}^{3} a_{kl}\,e_k e_l\Bigr)^{1/2}, \tag{111.10}$$

where the $a_{kl}$ $(= a_{lk})$ are a set of six constants such that $N^2 > 0$
for all values of the $e_k$. Now the laws of refraction are

$$N'^{-1}\sum_l a'_{kl}\,e'_l - N^{-1}\sum_l a_{kl}\,e_l = \sigma p_k, \tag{111.11}$$

where $\mathbf{p} = (p_1, p_2, p_3)$. It is always possible to make a rotation of
coordinates so that after the rotation the constants in (111.10) have
the values

$$a_{kl} = \begin{cases} a_k & (k = l), \\ 0 & (k \neq l), \end{cases} \tag{111.12}$$

where the $a_k$ are non-negative; but this can in general be done for
only one of the media to which (111.11) refers. As a specially simple
case, let $N$ be given by (111.10), whilst the second medium shall be
a vacuum, so that $N' = (\alpha'^2 + \beta'^2 + \gamma'^2)^{1/2}$. With the coordinates so
chosen that (111.12) obtains, we therefore have

$$e'_k - N^{-1}a_k e_k = \sigma p_k, \tag{111.13}$$

which expresses $\mathbf{e}'$ directly as a function of the components of $\mathbf{e}$ and
$\mathbf{p}$. Moreover, it shows particularly clearly that $\mathbf{e}$, $\mathbf{e}'$ and $\mathbf{p}$ will in
general not be coplanar, for otherwise there would have to be constants
$q$ and $q'$ such that the left-hand member of (111.13), or,
more generally, of (111.11), takes the form $q'\mathbf{e}' - q\mathbf{e}$.
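
The step from (111.10) to (111.11) is stated without intermediate working. A short worked version, assuming nothing beyond the definition (111.2) of $\mathbf{n}$ and the symmetry of the coefficients, might read:

```latex
\begin{aligned}
N^2 &= \sum_{k,l} a_{kl}\, e_k e_l
\;\Longrightarrow\;
2N\,\frac{\partial N}{\partial e_k} = 2\sum_{l} a_{kl}\, e_l
\qquad (a_{kl} = a_{lk}), \\[4pt]
n_k &= \frac{\partial N}{\partial e_k} = N^{-1}\sum_{l} a_{kl}\, e_l , \\[4pt]
\mathbf{e}\cdot\mathbf{n} &= N^{-1}\sum_{k,l} a_{kl}\, e_k e_l = N .
\end{aligned}
```

Inserting $n_k$, and the corresponding primed expression, into (111.7) gives (111.11). The last line confirms that (111.10) is already homogeneous of degree 1, in keeping with (111.5).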

As a matter of fact (111.10) is far from academic. The so-called
uniaxial crystal has the refractive indices

$$N_A = b^{1/2}(\alpha^2 + \beta^2 + \gamma^2)^{1/2}, \qquad N_B = [b\alpha^2 + a(\beta^2 + \gamma^2)]^{1/2}, \tag{111.14}$$

where $a$ and $b$ are positive constants, and the axes have been appropriately
chosen. Evidently the crystal behaves as an isotropic medium
for A-rays. It will be noted that $N_B$ is invariant under rotations about
the x-axis, and also under reflections in any plane containing this
axis: so that we have a kind of counterpart to the refractive index
function of an isotropic inhomogeneous symmetric system. A
particularly simple situation exists when the boundary at which the
refraction described by (111.13) occurs is a plane normal to the
x-axis. Then (111.13) becomes, for the B-ray,

$$\alpha' - (b/N)\alpha = \sigma, \qquad \beta' = (a/N)\beta, \qquad \gamma' = (a/N)\gamma. \tag{111.15}$$

If the incident and refracted rays make angles $\phi$, $\phi'$ with the
x-axis respectively one finds very easily that

$$\sin\phi' = a\,[b + (a - b)\sin^2\phi]^{-1/2}\sin\phi. \tag{111.16}$$

Further, in this special case $\mathbf{e}$, $\mathbf{e}'$ and $\mathbf{p}$ are coplanar. Thus, the
angle between the normals of the planes of incidence and refraction
vanishes if and only if $\mathbf{e}\times\mathbf{e}'\cdot\mathbf{p} = 0$, which implies here merely the
condition that $\beta\gamma' - \beta'\gamma = 0$, and this is satisfied.
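
The refraction law (111.13) lends itself to a quick numerical check. The following Python sketch refracts a B-ray out of the uniaxial crystal of (111.14) into a vacuum and compares the result with (111.16). The function names, the sample values of $a$ and $b$, and the choice of root for $\sigma$ (the one that carries the ray across the boundary, assuming $\mathbf{p}$ points from the crystal into the vacuum) are assumptions made for illustration only.

```python
import math

def ray_vector(e, a, b):
    """n = Grad N for N_B = [b*alpha^2 + a*(beta^2 + gamma^2)]^(1/2), cf. (111.14)."""
    alpha, beta, gamma = e
    N = math.sqrt(b * alpha**2 + a * (beta**2 + gamma**2))
    return (b * alpha / N, a * beta / N, a * gamma / N)

def refract_into_vacuum(e, p, a, b):
    """Solve e' = n + sigma*p with |e'| = 1, i.e. (111.13) with a vacuum beyond the boundary.

    e is the unit ray direction in the crystal, p the unit normal of the boundary.
    Returns None when no real refracted ray exists.
    """
    n = ray_vector(e, a, b)
    n_dot_p = sum(ni * pi for ni, pi in zip(n, p))
    n_sq = sum(ni * ni for ni in n)
    disc = n_dot_p**2 - (n_sq - 1.0)
    if disc < 0.0:
        return None
    # Root with e'.p = +sqrt(disc): the refracted ray continues across the boundary.
    sigma = -n_dot_p + math.sqrt(disc)
    return tuple(ni + sigma * pi for ni, pi in zip(n, p))

# Illustrative values only: boundary normal to the x-axis, ray at 20 degrees to it.
a, b = 2.25, 2.56
phi, psi = math.radians(20.0), math.radians(35.0)
e = (math.cos(phi), math.sin(phi) * math.cos(psi), math.sin(phi) * math.sin(psi))
e_out = refract_into_vacuum(e, (1.0, 0.0, 0.0), a, b)
sin_phi_out = math.hypot(e_out[1], e_out[2])                          # sine of refracted angle
sin_phi_formula = a * math.sin(phi) / math.sqrt(b + (a - b) * math.sin(phi)**2)  # (111.16)
coplanarity = e[1] * e_out[2] - e[2] * e_out[1]                       # beta*gamma' - beta'*gamma
```

Here `sin_phi_out` and `sin_phi_formula` should agree to rounding error, and `coplanarity` should vanish, as the text asserts for a boundary normal to the axis.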

These simple remarks will suffice to show how one can deal with
the passage of rays through homogeneous anisotropic media by
means which do not differ in principle from those to be used when
isotropy obtains.


112. The point characteristics


It has already been remarked that when the optical medium is
anisotropic one is, in general, confronted with two point characteristics
$V_A$ and $V_B$. The time has come to derive equations corresponding
to (3.4) and (3.5) for them. However, according to our
previous agreement, we can deal with both functions at the same 
time by just omitting the subscripts A and B, and understanding 
every equation in which V occurs really to stand for two such 
similar equations with the indices A and B supplied in the appro- 
priate places. 

An elementary argument entirely similar to that pursued in 
Section 3 can be used to obtain the required result. The construc- 
tion to which Fig. 2.1 relates can be taken over unchanged, and 
everything remains the same up to and including equation (3.2). 
The pair of equations following this is, however, no longer valid. 
The reason is this: the optical distances between B and A on the one 
hand and between B and C on the other (C being the foot of the 
normal from A on to “) were previously taken to be equal. This is 
no longer the case since the lines BA and BC have different 
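directions. In fact, in the notation of equation (111.2), if $BA = l$,

$$V(B, C) - V(B, A) = l\,\mathbf{n}\cdot\delta\mathbf{e} = \mathbf{n}\cdot(\delta\mathbf{s} - \mathbf{e}\,\delta l) = \mathbf{n}\cdot[\delta\mathbf{s} - (\mathbf{e}\cdot\delta\mathbf{s})\,\mathbf{e}]. \tag{112.1}$$

There comes

$$V(B, A_1) - V(B, A) = \{\mathbf{n} + [N - (\mathbf{n}\cdot\mathbf{e})]\,\mathbf{e}\}\cdot\delta\mathbf{s}. \tag{112.2}$$

Evidently we have done little more than to re-derive equation
(111.4). We now recall our convention that all refractive indices are
to be written as homogeneous functions of degree 1 in the direction
cosines. The right-hand member of (112.2) then reduces to $\mathbf{n}\cdot\delta\mathbf{s}$.
Accordingly, dealing in an entirely similar way with the difference
$V(C', B') - V(A'_1, B')$, we obtain in place of (3.4)

$$\delta V = \mathbf{n}'\cdot\delta\mathbf{s}' - \mathbf{n}\cdot\delta\mathbf{s}. \tag{112.3}$$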


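The basic equations of Hamiltonian optics, including now propagation
in anisotropic media, are therefore

$$\begin{aligned}
\frac{\partial N'}{\partial\alpha'} &= \frac{\partial V}{\partial x'}, &
\frac{\partial N'}{\partial\beta'} &= \frac{\partial V}{\partial y'}, &
\frac{\partial N'}{\partial\gamma'} &= \frac{\partial V}{\partial z'}, \\
\frac{\partial N}{\partial\alpha} &= -\frac{\partial V}{\partial x}, &
\frac{\partial N}{\partial\beta} &= -\frac{\partial V}{\partial y}, &
\frac{\partial N}{\partial\gamma} &= -\frac{\partial V}{\partial z}.
\end{aligned} \tag{112.4}$$

These equations may be resolved algebraically for the components
of $\mathbf{e}'$ and $\mathbf{e}$. Then the equations $\mathbf{e}'\cdot\mathbf{e}' = 1$ and $\mathbf{e}\cdot\mathbf{e} = 1$ yield a pair
of non-linear first-order partial differential equations which V must
satisfy. For example, if a homogeneous medium has a refractive
index given by the second member of (111.14),

$$b^{-1/2}V_x = -b^{1/2}\alpha/N, \qquad a^{-1/2}V_y = -a^{1/2}\beta/N, \qquad a^{-1/2}V_z = -a^{1/2}\gamma/N. \tag{112.5}$$

Squaring and adding these, one obtains the equation

$$b^{-1}V_x^2 + a^{-1}(V_y^2 + V_z^2) = 1, \tag{112.6}$$

and there is a similar equation in which the coordinates are primed.
The two, taken together, are satisfied by the function

$$V = \{b(x' - x)^2 + a[(y' - y)^2 + (z' - z)^2]\}^{1/2}. \tag{112.7}$$

Given $x, y, z$, the surface V = const. is evidently an ellipsoid of
revolution.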
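
That (112.7) satisfies (112.6) can be confirmed by direct differentiation, or numerically. The Python sketch below does the latter with central differences; the function names, the step size `h` and the sample coordinates are illustrative assumptions, not part of the text.

```python
import math

def point_characteristic(x, y, z, xp, yp, zp, a, b):
    """V of (112.7) for the homogeneous uniaxial medium (B-rays)."""
    return math.sqrt(b * (xp - x)**2 + a * ((yp - y)**2 + (zp - z)**2))

def residual_112_6(x, y, z, xp, yp, zp, a, b, h=1e-5):
    """Left-hand side of (112.6) minus 1, with V_x, V_y, V_z from central differences."""
    def V(x_, y_, z_):
        return point_characteristic(x_, y_, z_, xp, yp, zp, a, b)
    Vx = (V(x + h, y, z) - V(x - h, y, z)) / (2 * h)
    Vy = (V(x, y + h, z) - V(x, y - h, z)) / (2 * h)
    Vz = (V(x, y, z + h) - V(x, y, z - h)) / (2 * h)
    return Vx**2 / b + (Vy**2 + Vz**2) / a - 1.0

# Arbitrary sample points and constants, chosen only for the check.
res = residual_112_6(0.1, -0.2, 0.3, 1.5, 0.7, -0.4, a=2.25, b=2.56)
```

The residual `res` should be zero to within the accuracy of the finite-difference approximation.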

It is possible to define other characteristic functions in much the 
same way as was done in the context of isotropy, though the details 
are more complicated. We do not consider them, however, since we 
shall have no occasion to use them. 


113. Rays and normals 


The consideration of surfaces of constant V is of course in no way 
restricted to the special case (112.7). Given any characteristic 
function we may always consider the family of surfaces V = const. 
regarding either $x, y, z$ or $x', y', z'$ as fixed. Merely from a notational
point of view we contemplate the latter alternative, thereby avoiding
the appearance of a large number of primes. At the same time we
shall with advantage take primed and unprimed variables to refer
to object and image space respectively. (This notation would have
been better all along in any event, and it was the one originally adopted
by Hamilton, but tradition dictated otherwise.) If $P(x, y, z)$ is
any point, two directions of interest are associated with it, namely
(i) the unit normal $\mathbf{u}$ at P to the surface W defined by

$$V(\bar{x}, \bar{y}, \bar{z}) = V(x, y, z); \tag{113.1}$$

and (ii) the direction $\mathbf{e}$ of the ray through P and the fixed point
$O'(x', y', z')$. What has to be constantly borne in mind is that $\mathbf{e}$
and $\mathbf{u}$ are, in general, distinct. The only exception occurs when $N$
has the form appropriate to an isotropic medium (recall equations
(111.14)):

$$N = N_0(x, y, z)\,(\alpha^2 + \beta^2 + \gamma^2)^{1/2}. \tag{113.2}$$


We are of course talking about full (i.e. two-parameter) pencils 
of rays through O’; at certain points of W e and u may happen to 
coincide. More formally, since, according to (112.4), 


$$\mathbf{n} = n\mathbf{u}, \tag{113.3}$$

where $n$ is the magnitude of $\mathbf{n}$, $\mathbf{u}$ and $\mathbf{e}$ coincide if and only if

$$\mathbf{e}\times\mathbf{n} = 0, \tag{113.4}$$

and if no restriction be placed upon the values of $\beta$ and $\gamma$ this
condition is satisfied only when (113.2) obtains. As a somewhat
trivial example of the coincidence of $\mathbf{u}$ and $\mathbf{e}$ for isolated rays we
infer from (112.5) that in the case of the uniaxial crystal ($N_B$ given
by (111.14)), $\mathbf{e}\times\mathbf{u} = 0$ only if either $\alpha = 0$ or $\alpha = 1$.
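
The distinction between ray and normal can be made concrete for the B-rays of the uniaxial crystal. The Python sketch below evaluates the angle between $\mathbf{e}$ and $\mathbf{u}$ as a function of the inclination of the ray to the x-axis; the function name and the sample constants are assumptions introduced for the example. By the rotational symmetry of $N_B$ about the x-axis it suffices to work in the plane containing $\mathbf{e}$ and that axis.

```python
import math

def angle_between_ray_and_normal(theta, a, b):
    """Angle (radians) between e and u for N_B of (111.14).

    theta is the angle between the ray e and the x-axis. In the plane spanned by
    e and the x-axis, e = (cos theta, sin theta) and n is proportional to
    (b cos theta, a sin theta), cf. (112.5).
    """
    alpha, rho = math.cos(theta), math.sin(theta)   # rho = (beta^2 + gamma^2)^(1/2)
    nx, nr = b * alpha, a * rho
    cross = alpha * nr - rho * nx                   # equals (a - b) * alpha * rho
    dot = alpha * nx + rho * nr
    return math.atan2(abs(cross), dot)

# Illustrative constants only.
a, b = 2.25, 2.56
on_axis = angle_between_ray_and_normal(0.0, a, b)
oblique = angle_between_ray_and_normal(math.pi / 4, a, b)
across = angle_between_ray_and_normal(math.pi / 2, a, b)
```

The cross product is proportional to $(a - b)\alpha(\beta^2 + \gamma^2)^{1/2}$, so the angle vanishes only along the axis and at right angles to it (or trivially when $a = b$).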
We return briefly to the law of refraction (111.7). We may write
it as

$$n'\mathbf{u}' - n\mathbf{u} = \sigma\mathbf{p}. \tag{113.5}$$

If desired this may be regarded as governing in the first instance
the refraction of the normal $\mathbf{u}$, rather than of the ray $\mathbf{e}$. In that case
the angles $I_u$, $I'_u$ of incidence and refraction of the normal formally
obey

$$n'\sin I'_u = n\sin I_u, \tag{113.6}$$

whilst $\mathbf{u}$, $\mathbf{u}'$ and $\mathbf{p}$ are coplanar. However, the apparent refractive
index $n$ (sometimes called the wave-index) depends upon the direction
of the ray, and the simplicity of (113.6) is deceptive. The quantity
$N$, here simply called ‘refractive index’, is more often referred
to as the ‘ray-index’. Our terminology is appropriate to geometrical
optics, bearing in mind, on the one hand, the central position
which $N$ occupies as the function to be varied in the basic integral
$\int N\,ds$, and, on the other, that one is ultimately interested in the
direction of transport of light rather than in formal constructions.


114. Remark on the occurrence of anisotropy 


Although it is not our purpose to study the details of the propa- 
gation of light in anisotropic media, it is convenient, for the sake of 
orientation, to remark upon situations in which anisotropy is 
encountered in practice, and to comment upon the form of the 
functions $N_A$ and $N_B$. We have already had one special example,
i.e. that of the homogeneous medium to which (111.14) relates.
This may be expected to be a degenerate case of a more general
medium, on account of the fact that only two constants occur in
$N_A$ and $N_B$ together. This is indeed the case. Thus, the so-called
biaxial crystal is a homogeneous medium for which

$$\left.\begin{matrix} N_A^2 \\ N_B^2 \end{matrix}\right\} = \tfrac{1}{2}\bigl\{(b+c)\alpha^2 + (c+a)\beta^2 + (a+b)\gamma^2 \mp [(b-c)^2\alpha^4 + (c-a)^2\beta^4 + (a-b)^2\gamma^4 - 2(c-a)(a-b)\beta^2\gamma^2 - 2(a-b)(b-c)\gamma^2\alpha^2 - 2(b-c)(c-a)\alpha^2\beta^2]^{1/2}\bigr\}. \tag{114.1}$$

Here $a$, $b$, $c$ are positive constants, and we may suppose that

$$a \geq b \geq c. \tag{114.2}$$

This can always be achieved by choosing an appropriate orientation
of the coordinate axes. In any event, it is important to realize
that the comparative simplicity of (114.1) may be destroyed by
allowing the coordinates to undergo some general rotation. In
that case the expression in square brackets for instance becomes a
more general quartic form; see equation (115.33).

When $b = c$ (114.1) is easily seen to reduce exactly to (111.14).
Also, we note that $N_A = N_B$ for those rays for which the square
root in (114.1) vanishes. One most conveniently replaces $\alpha^2$ by
$1 - \beta^2 - \gamma^2$ in the latter, and it then turns out that the rays in question
have

$$(\alpha, \beta, \gamma) = \bigl([(a-b)/(a-c)]^{1/2},\; 0,\; [(b-c)/(a-c)]^{1/2}\bigr). \tag{114.3}$$

These directions, of which there are two, define the so-called
optical (ray-) axes of the crystal; and this explains the qualification
‘biaxial’.
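
Both statements of the last paragraph, the reduction to (111.14) when $b = c$ and the coincidence of the two indices along the directions (114.3), can be checked numerically. The Python sketch below is one way of doing so; the function names and the sample constants are assumptions made for the example, and the smaller root is returned first, which corresponds to $N_A^2$ under the ordering (114.2).

```python
import math

def ray_indices_squared(e, a, b, c):
    """Return (N_A^2, N_B^2) from (114.1) for a unit direction e = (alpha, beta, gamma)."""
    A2, B2, C2 = (t * t for t in e)
    mean = (b + c) * A2 + (c + a) * B2 + (a + b) * C2
    disc = ((b - c)**2 * A2**2 + (c - a)**2 * B2**2 + (a - b)**2 * C2**2
            - 2 * (c - a) * (a - b) * B2 * C2
            - 2 * (a - b) * (b - c) * C2 * A2
            - 2 * (b - c) * (c - a) * A2 * B2)
    root = math.sqrt(max(disc, 0.0))   # guard against tiny negative rounding errors
    return 0.5 * (mean - root), 0.5 * (mean + root)

def ray_axis(a, b, c):
    """One of the directions (114.3); requires a > c."""
    return (math.sqrt((a - b) / (a - c)), 0.0, math.sqrt((b - c) / (a - c)))

# Illustrative constants only.
a, b, c = 2.9, 2.6, 2.3
NA2_axis, NB2_axis = ray_indices_squared(ray_axis(a, b, c), a, b, c)

# Degenerate (uniaxial) case b = c, compared with (111.14).
e = (0.6, 0.0, 0.8)
NA2_uni, NB2_uni = ray_indices_squared(e, 2.9, 2.6, 2.6)
NB2_expected = 2.6 * e[0]**2 + 2.9 * (e[1]**2 + e[2]**2)
```

Along the ray axis the two values coincide (and in fact equal $b$), while in the uniaxial case the pair reduces to $b$ and $b\alpha^2 + a(\beta^2 + \gamma^2)$.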

So much for homogeneous media. A prominent example of an
inhomogeneous anisotropic ‘medium’ is that of the electron-optical
system, as has been mentioned before. As with light optics,
the geometrical theory can be used in circumstances such that
diffraction effects are negligible. For scalar electrons there will
certainly be only one refractive index; and when axial symmetry of
the system obtains, it has the generic form

$$N = (\alpha^2 + \beta^2 + \gamma^2)^{1/2}\,N^{*}(x, y^2 + z^2) + (y\gamma - z\beta)\,N^{\times}(x, y^2 + z^2), \tag{114.4}$$

in a notation closely adapted to that of equation (72.1). The ‘skew
refractive index’ $N^{\times}$ vanishes in the absence of a magnetic field,
whereas the absence of the electric field entails the constancy of the
‘normal refractive index’ $N^{*}$. The equations for V are

$$V_x^2 + V_y^2 + V_z^2 + 2(zV_y - yV_z)N^{\times} + (y^2 + z^2)N^{\times 2} = N^{*2}, \tag{114.5}$$

together with that in which all variables are primed. The presence
of the factor $zV_y - yV_z$ reflects the necessity for the explicit occurrence
of the skew-symmetric invariant in (72.1).

More complex instances of anisotropy arise, for example, when
optical media are placed in electrostatic fields, or are subjected
to mechanical stresses. When an initially optically isotropic medium
(optical isotropy does not necessarily require that the medium be
amorphous, i.e. non-crystalline) is elastically deformed, it will in
general become anisotropic. If the deformation is non-uniform the
medium will then no longer be optically homogeneous or isotropic.
Still, in principle one need only know the functions $N_A$ and $N_B$:
once they are available one can hope to gain insight into the general 
optical behaviour of the system in question without having to bother 
with cumbersome and in the first place irrelevant detail. The 
material of Chapter 7 is a good illustration of this remark. There 
the generic form of (72.1), which could also be taken to be a direct 
consequence of (114.4), told us a good deal in a very simple way 
about the imagery associated with axially symmetric electron- 
optical systems; these being merely special semi-symmetric 
systems. At the same time we were not overwhelmed by having to 
consider the very complex relationship which exists between the 
explicit form of V of any particular system and the disposition of 
electric and magnetic fields within it.


115. Considerations relating to Maxwell’s equations


With the preceding section we have essentially completed our 
treatment of the generalities of Hamiltonian optics. It may, how- 
ever, not come amiss to deal very briefly with the relation between 
geometrical optics and the electromagnetic equations, if for no 
other reason than to amplify a little the remarks of Section 1. The 
material to be presented is to some extent out of keeping with the 
more phenomenological character of the rest of this work. For this 
reason it will be kept as brief as possible, partly by remaining on a 
mathematically superficial level, and partly by allowing ourselves 
to be satisfied with considerations of plausibility rather than with 
strict and detailed proofs. In this way we may hope to gain insight 
into the problem as easily as possible. 

The propagation of electromagnetic fields through a non-absorbing,
source-free medium is fully described by Maxwell’s
equations

$$\begin{aligned}
c\,\mathrm{curl}\,\mathbf{E} &= -\partial\mathbf{H}/\partial t, & c\,\mathrm{curl}\,\mathbf{H} &= \partial\mathbf{D}/\partial t, \\
\mathrm{div}\,\mathbf{H} &= 0, & \mathrm{div}\,\mathbf{D} &= 0.
\end{aligned} \tag{115.1}$$

It has been assumed that the magnetic permeability is unity, so
that no distinction need be drawn between $\mathbf{B}$ and $\mathbf{H}$. To make
(115.1) determinate one still has to add an equation giving the relation
between $\mathbf{D}$ and $\mathbf{E}$. In general this relation may be non-linear,
but we shall suppose this not to be the case. Then one has

$$D_k = \epsilon_{kl}(x, y, z)\,E_l, \tag{115.2}$$

where the $\epsilon_{kl}$ are six given functions, for one can show that $\epsilon_{kl}$ must
be identically equal to $\epsilon_{lk}$. Note that we are using the summation
convention: if an index is repeated in any expression, summation
over that index over the range 1, 2, 3 is understood.

Let the radiation be monochromatic, of wavelength $\lambda$. The $\epsilon_{kl}$,
which may be functions of the wavelength, are to have the values
appropriate to $\lambda$. Then write

$$\mathbf{D} = \mathbf{D}_0(x, y, z)\,\Phi, \qquad \mathbf{E} = \mathbf{E}_0(x, y, z)\,\Phi, \qquad \mathbf{H} = \mathbf{H}_0(x, y, z)\,\Phi, \tag{115.3}$$

with

$$\Phi = \exp[(2\pi i/\lambda)(V - ct)], \tag{115.4}$$

where V is some function of $x, y, z$ yet to be determined. For the
purpose of orientation, consider the last of equations (115.1).
With (115.3, 4) it becomes

$$\lambda\,\mathrm{div}\,\mathbf{D}_0 + 2\pi i\,\mathbf{D}_0\cdot\mathrm{grad}\,V = 0. \tag{115.5}$$

The transition to the geometrical-optical limit corresponds to
allowing $\lambda$ to tend to zero. In so doing it is assumed that $\lambda\,\mathrm{div}\,\mathbf{D}_0$
tends to zero; or, more generally, that the derivatives of the components
of the amplitude vectors do not tend to infinity as fast as or
faster than $\lambda^{-1}$. In a more realistic picture the assumption is that,
for values of $\lambda$ to be contemplated, terms (in equations such as
(115.5)) which have $\lambda$, or some higher power of $\lambda$, as factors can be
neglected when compared with the terms independent of $\lambda$.
Physically this assumption is not justified, for instance, at the rim of
a stop, to give but one example; a state of affairs which, in a sense,
ultimately reveals itself in the fact that the shadow of a sharp edge
does not itself have a sharp edge.

Dealing with the remaining equations of (115.1) in the manner 
just discussed, the vector grad V turns up again and again, and we 
therefore write 


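$$\mathrm{grad}\,V = \mathbf{n} \tag{115.6}$$

for convenience. Then we have

$$\mathbf{n}\times\mathbf{E}_0 = \mathbf{H}_0, \qquad \mathbf{n}\times\mathbf{H}_0 = -\mathbf{D}_0, \tag{115.7}$$

$$\mathbf{n}\cdot\mathbf{H}_0 = 0, \qquad \mathbf{n}\cdot\mathbf{D}_0 = 0, \tag{115.8}$$

in the formal limit $\lambda \to 0$. Note that equations (115.8) are already
contained in (115.7). The local energy density $W$ in the medium is
given by

$$W = \tfrac{1}{2}(\mathbf{E}\cdot\mathbf{D} + \mathbf{H}\cdot\mathbf{H}), \tag{115.9}$$

whilst the energy flux (Poynting vector) is

$$\mathbf{S} = \mathbf{E}\times\mathbf{H}. \tag{115.10}$$

(The usual factors $1/4\pi$ and $c/4\pi$ have been omitted from the
right-hand members of the last two equations. These factors can
always be removed by a suitable choice of units.) Then

$$\mathbf{E}\cdot\mathbf{D} = \mathbf{E}_0\cdot\mathbf{D}_0\,\Phi^2 = -\mathbf{n}\times\mathbf{H}_0\cdot\mathbf{E}_0\,\Phi^2 = \mathbf{n}\cdot\mathbf{E}_0\times\mathbf{H}_0\,\Phi^2,$$

and

$$\mathbf{H}\cdot\mathbf{H} = (\mathbf{n}\times\mathbf{E}_0)\cdot\mathbf{H}_0\,\Phi^2,$$

so that

$$W = \mathbf{n}\cdot\mathbf{S} = S\,\mathbf{n}\cdot\mathbf{e}, \tag{115.11}$$

where $S$ is the magnitude of $\mathbf{S}$ and $\mathbf{e}$ is the usual tangent vector
along the ray at the point considered, since, by definition, $\mathbf{e}$ is in the
direction of the energy flow. Writing

$$N = W/S, \tag{115.12}$$

(115.11) becomes

$$\mathbf{n}\cdot\mathbf{e} = N. \tag{115.13}$$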


The differential equation for V follows directly from (115.7). 
Inserting the first of these equations into the second, there comes 


$$(\mathbf{n}\cdot\mathbf{E}_0)\,\mathbf{n} - n^2\mathbf{E}_0 + \mathbf{D}_0 = 0, \tag{115.14}$$

where $n$ is the magnitude of $\mathbf{n}$. Writing this in component form and
using (115.2) one gets the set of equations

$$(n_k n_l + \epsilon_{kl} - n^2\delta_{kl})\,E_{0l} = 0, \tag{115.15}$$

where the Kronecker delta $\delta_{kl}$ has the value 1 or 0 according as $k$
is or is not equal to $l$. These equations are homogeneous, and so have
a non-trivial solution only if their discriminant vanishes:

$$\det(n_k n_l + \epsilon_{kl} - n^2\delta_{kl}) = 0. \tag{115.16}$$

This equation is quadratic in $n^2$. Upon solving it for $n^2$ one has, in
effect, two distinct equations (115.14), and therefore two functions 
V each of which satisfies (115.6). 
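
That (115.16) is only quadratic in $n^2$, for a given direction $\mathbf{u}$ of $\mathbf{n}$, can be seen numerically. The Python sketch below writes $\mathbf{n} = n\mathbf{u}$, treats the determinant as a polynomial in $n^2$, and recovers its two roots from three sample values; the function names and the sample dielectric constants are assumptions made for the example, and $\mathbf{u}$ is taken to be a unit vector.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def wave_index_squared(eps, u):
    """Roots n^2 of det(n^2 (u_k u_l - delta_kl) + eps_kl) = 0, cf. (115.16)."""
    def f(x):
        return det3([[x * (u[k] * u[l] - (1.0 if k == l else 0.0)) + eps[k][l]
                      for l in range(3)] for k in range(3)])
    # For unit u the cubic term vanishes, so f(x) = c0 + c1*x + c2*x^2 and
    # three samples determine the coefficients.
    c0 = f(0.0)
    c1 = 0.5 * (f(1.0) - f(-1.0))
    c2 = 0.5 * (f(1.0) + f(-1.0)) - c0
    root = math.sqrt(c1 * c1 - 4.0 * c2 * c0)
    return sorted([(-c1 - root) / (2.0 * c2), (-c1 + root) / (2.0 * c2)])

# Illustrative values: principal-axis form of the dielectric tensor.
eps_biaxial = [[2.9, 0.0, 0.0], [0.0, 2.6, 0.0], [0.0, 0.0, 2.3]]
along_x = wave_index_squared(eps_biaxial, (1.0, 0.0, 0.0))

eps_uniaxial = [[2.9, 0.0, 0.0], [0.0, 2.6, 0.0], [0.0, 0.0, 2.6]]
oblique = wave_index_squared(eps_uniaxial, (0.6, 0.0, 0.8))
```

For a normal along a principal axis the two roots are the other two principal values. In the uniaxial case one root is independent of direction, as expected of the A-light.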

It is possible to express N explicitly as a function of e (and, 
of course, of x, y, 2) in the following manner. If one multiplies the 
first of equations (115.7) vectorially by $\mathbf{E}_0$, one obtains a linear relation
between $\mathbf{S}$, $\mathbf{E}_0$ and $\mathbf{n}$, which means that $\mathbf{e}$, $\mathbf{n}$ and $\mathbf{E}_0$ are coplanar.
So, however, are $\mathbf{D}_0$, $\mathbf{n}$ and $\mathbf{E}_0$ also, according to (115.14). It follows
that $\mathbf{e}$, $\mathbf{D}_0$ and $\mathbf{E}_0$ are coplanar, i.e. there exist constants $a$ and $b$
such that $\mathbf{e} = a\mathbf{D}_0 + b\mathbf{E}_0$.
Scalar multiplication alternatively by $\mathbf{e}$ and $\mathbf{n}$ allows us to determine
$a$ and $b$, and then

$$\mathbf{e} = \frac{\mathbf{D}_0}{\mathbf{e}\cdot\mathbf{D}_0} + \frac{N\,\mathbf{E}_0}{\mathbf{n}\cdot\mathbf{E}_0}, \tag{115.17}$$

where (115.8) and (115.13) have been used. Scalar multiplication
of (115.14) by $\mathbf{e}$ yields the relation

$$\mathbf{e}\cdot\mathbf{D}_0 = -N\,\mathbf{n}\cdot\mathbf{E}_0,$$

and on combining this with (115.17) there follows the result

$$(\mathbf{e}\cdot\mathbf{D}_0)\,\mathbf{e} - \mathbf{D}_0 + N^2\mathbf{E}_0 = 0. \tag{115.18}$$

Here it is convenient to invert the relation (115.2), i.e. to take the
latter in the equivalent form

$$E_k = \eta_{kl}(x, y, z)\,D_l, \tag{115.19}$$

where

$$\epsilon_{kl}\,\eta_{lm} = \delta_{km}. \tag{115.20}$$

We note in passing that if $\epsilon$ and $\eta$ are the determinants of $\epsilon_{kl}$ and
$\eta_{kl}$ respectively, then

$$\epsilon\eta = 1. \tag{115.21}$$

Equation (115.18) now becomes

$$(N^2\eta_{kl} + e_k e_l - \delta_{kl})\,D_{0l} = 0, \tag{115.22}$$

which should be compared with (115.15). Here again the discriminant
must be zero, i.e.

$$\det(N^2\eta_{kl} + e_k e_l - \delta_{kl}) = 0, \tag{115.23}$$

which is an equation for $N^2$. It is merely a question of algebraic
detail (involving the use of (115.21)) to show that (115.23) may be
written as

$$N^4 + (\epsilon_{kl}\,e_k e_l - \mathrm{tr}\,\epsilon)\,N^2 + \epsilon\,\eta_{kl}\,e_k e_l = 0, \tag{115.24}$$

where $\mathrm{tr}\,\epsilon = \epsilon_{11} + \epsilon_{22} + \epsilon_{33}$. (115.24) is the final form of the equation
which allows us to write down $N^2$ as a function of the $e_k$; and here
also we have two possibilities. (It is left understood that the last
term of (115.24) and $\mathrm{tr}\,\epsilon$ must each be supplied with an additional
factor $e_n e_n$ $(= 1)$ to make $N$ homogeneous of degree 1.)

Return now to equation (115.13). If Grad N is the vector whose 
components are $N_\alpha, N_\beta, N_\gamma$, just as grad N denotes the vector whose
components are $N_x, N_y, N_z$, we obtain, by differentiation,

$$\mathbf{n}\cdot d\mathbf{e} + \mathbf{e}\cdot d\mathbf{n} = \mathrm{Grad}\,N\cdot d\mathbf{e} + \mathrm{grad}\,N\cdot d\mathbf{x}, \tag{115.25}$$

$d\mathbf{x}$ being the vector joining arbitrary neighbouring points. On the
other hand, using (115.6),

$$\mathbf{e}\cdot d\mathbf{n} = e_k\,dn_k = e_k\frac{\partial n_k}{\partial x_l}\,dx_l = e_k\frac{\partial n_l}{\partial x_k}\,dx_l = \frac{d\mathbf{n}}{ds}\cdot d\mathbf{x}, \tag{115.26}$$

where $d/ds$ stands for the derivative along the ray. (115.25) now
becomes

$$(d\mathbf{n}/ds - \mathrm{grad}\,N)\cdot d\mathbf{x} = (\mathrm{Grad}\,N - \mathbf{n})\cdot d\mathbf{e}. \tag{115.27}$$

Observe now that in this section the quantities $N$, $V$ and $\mathbf{e}$ were
introduced ab initio, that is to say, as they arose in the context of
Maxwell’s equations. Complete correspondence with the equations 
of the previous sections of this chapter is now achieved by identi- 
fying N with the refractive index as it occurs in Fermat’s Principle, 
and V with the point characteristic as it occurs in equation (113.1). 
If this be done the first triplet of the equations (112.4)—with primes 
omitted—entails that the right-hand member of (115.27) vanishes, 
and then, since dx is arbitrary, we must have 


d(Grad N)/ds = grad N, (115.28) 


which is just the equation of the ray in the form which emerges 
directly from Fermat’s Principle. 

Having established the interpretation in terms of Maxwell’s 
theory of various quantities which first arose in the context of geo- 
metrical optics as such, we can go on to attach a more detailed 
physical meaning to the notion of the ‘two kinds of light’ introduced 
in Section 110; to relate the refractive indices $N_A$ and $N_B$ to the
components of the dielectric tensor $\epsilon_{kl}$, defined by (115.2); and so
on. Thus, very briefly, one has just two refractive indices because
(115.24) is quadratic in $N^2$; and that this is so is in turn related
to the fact that any electromagnetic wave can be regarded, at any
point, as the superposition of just two independent linearly polarized
components. As a matter of fact, given the unit vector $\mathbf{u}$ $(= \mathbf{n}/n)$ at
a point, there will be two vectors $\mathbf{E}_0$ (determined to within a scalar
factor) which satisfy (115.15), namely one for each of the two values
of $n^2$, $n_A^2$ and $n_B^2$, say, which satisfy (115.16). The corresponding
vectors $\mathbf{D}_{0A}$ and $\mathbf{D}_{0B}$ then turn out to be mutually perpendicular.
To show that this is so, introduce $\mathbf{D}_0$ in place of $\mathbf{E}_0$ in (115.15),
which becomes, in view of (115.20),

$$(\eta_{km} - u_k u_l\,\eta_{lm} - n^{-2}\delta_{km})\,D_{0m} = 0. \tag{115.29}$$

Now rotate the Cartesian axes so that at the point under consideration
the transformed components of $\mathbf{u}$ are (1, 0, 0), and then, since
$\mathbf{u}\cdot\mathbf{D}_0 = 0$, one also has $D_{01} = 0$. (115.29) then reduces to

$$(\eta_{km} - n^{-2}\delta_{km})\,D_{0m} = 0, \tag{115.30}$$

all quantities referring to the new axes, of course. We therefore
have the pair of equations

$$\eta_{km}D_{0mA} = n_A^{-2}D_{0kA}, \qquad \eta_{km}D_{0mB} = n_B^{-2}D_{0kB}.$$

Multiply the first of these by $D_{0kB}$ and the second by $D_{0kA}$ (summation
over $k$ being implied as usual). Then, upon subtracting the
resulting equations from one another, and bearing in mind that
$\eta_{km} = \eta_{mk}$, it follows that

$$(n_A^{-2} - n_B^{-2})\,\mathbf{D}_{0A}\cdot\mathbf{D}_{0B} = 0. \tag{115.31}$$

Granted that $n_A \neq n_B$, this constitutes the required result, for it is
invariant under rotations of the coordinates, so that the special
choice of axes which was made in the course of the proof is irrelevant.
The usual interpretation of N follows from (115.12), i.e. the 
ratio of the local speed with which energy is transported to the speed 
of light is 1/N. It remains to consider for a moment the relationship 


of $N$ to $\epsilon_{kl}$. Writing

$$\bar{\epsilon}_{kl} = \delta_{kl}\,\mathrm{tr}\,\epsilon - \epsilon_{kl}, \tag{115.32}$$

(115.24) gives in the most general case

$$\left.\begin{matrix} 2N_A^2 \\ 2N_B^2 \end{matrix}\right\} = \bar{\epsilon}_{kl}\,e_k e_l \mp [(\bar{\epsilon}_{kl}\,\bar{\epsilon}_{mn} - 4\epsilon\,\eta_{kl}\,\delta_{mn})\,e_k e_l e_m e_n]^{1/2}. \tag{115.33}$$

Purely formally, this equation is more general than that which
appears in the customary accounts of crystal optics. The reason
for this is as follows. When the medium is homogeneous it is possible
to arrange one’s coordinate axes in such a way that, relative
to them, $\epsilon_{kl} = 0$ when $k \neq l$; and when this has been done we write
$\epsilon_{11} = \epsilon_1$, $\epsilon_{22} = \epsilon_2$, $\epsilon_{33} = \epsilon_3$ $(\epsilon_k > 0)$. In other words, one can always
make a principal axis transformation, meaning a rotation of coordinates
such that, after it has been carried out, the axes are along
the principal axes of the ellipsoid whose equation, before the rotation,
was $\epsilon_{kl}\,x_k x_l = 1$. However, we have throughout considered
inhomogeneous media, i.e. the $\epsilon_{kl}$ were given functions of $x, y, z$,
rather than mere given constants. This means that although one
can make a principal axis transformation at any selected point, the
refractive indices will retain the general form (115.33) elsewhere;
cf. the remark following equation (111.12). If we do make a principal
axis transformation at some point, then (115.33) reduces there
exactly to (114.1), if we make the identification

$$(a, b, c) = (\epsilon_1, \epsilon_2, \epsilon_3). \tag{115.34}$$

A general medium therefore behaves locally either as an isotropic
medium, a uniaxial crystal, or a biaxial crystal, as the case may be; 
but, whichever alternative applies at any selected point, the same 
alternative need not apply at neighbouring points. 


Problems 
P.11(i). Show that Fermat’s Principle yields the equation 


d(Grad N)/ds = grad N 


for the equation of the ray, d/ds denoting the derivative in the direc- 
tion of the ray. 


P.11 (ii). Two uniaxial crystals have a common plane boundary, 
and both optical ray-axes are along the normal to the boundary. 
If $\phi$ and $\phi'$ are the angles of incidence and refraction of a B-ray
(for which $N_B$ is given by (111.14)), show that

$$\sin\phi' = a\,b'^{1/2}\sin\phi\,\{a'^2 b + [a'^2(a - b) + a^2(b' - a')]\sin^2\phi\}^{-1/2}.$$


CHAPTER 12 


THE COMPUTATION OF ABERRATION 
COEFFICIENTS 


116. Preliminary remarks. Lagrangian and Hamiltonian 
methods 


To solve a particular design problem in practice one requires the 
actual numerical values of the aberration coefficients of systems 
whose physical constitution is given in detail. We therefore need 
to inquire in the first place how such numerical values might be 
calculated. Now, we have before us a task so extensive that we can
do little more than to present a mere intimation, as it were, of
how we might go about carrying it through; and this is quite
appropriate to the context of this work, which was from the outset
intended to concern itself with general principles rather than with
the solution of particular problems. In any event, there arises a
general issue of policy which may be described as follows.

The passage of families of rays through a given optical system 
may be analyzed in two ways which differ from each other in the 
manner in which particular rays are thought of as selected. Thus one 
may either specify two points $A$ and $A'$, one in the object space and
one in the image space, and then seek to discover the directions at
$A$ and $A'$ of the ray $\mathcal{R}$ through these points; or one may select a
ray $\mathcal{R}$ by prescribing, in the object space, a point $A$ on it and its
direction there, and then seek to find, in the image space, a point on
$\mathcal{R}$ and its direction at this point. A precisely analogous situation
exists in dynamics. In describing the motion of a particle P we can
ask the question: if P was at the point $A(x, y, z)$ at time $t$, and it was
at the point $A'(x', y', z')$ at time $t'$, what were the components of
momentum at $A$ and at $A'$ respectively? Alternatively we may ask:
if P was at $A$ at time $t$ and its momentum was then $\mathbf{p}$, where will P
be at time $t'$ and what will its momentum $\mathbf{p}'$ be at that time?

In the optical context we know, of course, that the first of the 
alternatives described above amounts to requiring the determination 
of the point characteristic V; and in the dynamical context one
likewise has to find a function called Hamilton’s Principal Function
(usually also denoted by V) which is the precise dynamical analogue
of the point characteristic. In either case a method of solving a 
given problem, optical or dynamical, which rests upon the calcula- 
tion of V is called a Hamiltonian method of computation. On the 
other hand, if a dynamical problem is to be solved in which the 
initial configuration and state of motion are prescribed, then this is 
achieved essentially by solving Lagrange’s equations of motion; 
and one thus provides the answer to the second of the alternative 
questions above. Correspondingly, one has to solve, in the isotropic 
optical case, the differential equations d(Ne)/ds = grad N, or what 
amounts to a set of finite difference equations, based on (2.6), when 
the optical system consists of a set of homogeneous media; the given 
initial conditions leading to a unique solution. It is therefore natural 
to speak of this type of procedure as a Lagrangian method of com- 
putation in the optical context. It is at this point that we have to ask 
ourselves whether it is the Hamiltonian or the Lagrangian method 
which is to be preferred from a purely computational point of 
view: one suspects it is the latter, simply on the grounds of analogy 
with the dynamical case. There can be little doubt that ‘everyday’ 
dynamical problems are most easily dealt with by solving Lagrange’s 
equations rather than by using the Hamilton-Jacobi theory, which 
essentially revolves about the Principal Function. Moreover, the 
specification of rays by coordinates all of which lie in the object 
space is undoubtedly convenient in practice. 

Some years ago (1954), in a monograph entitled Optical Aberra- 
tion Coefficients, and in a sequence of thirteen papers under the same 
general title, which appeared during the years 1956—67 in the Journal 
of the Optical Society of America, I gave a fairly detailed account of 
the Lagrangian method as applied to the symmetric system. (All this 
material has recently been reissued in one volume by Dover 
Publications, Inc.) It is based on the use of so-called 
quasi-invariants (the paraxial limit of which is essentially given by the joint 
paraxial invariants (10.12), see Section 119), and it leads to a syste- 
matic iterative procedure which allows one to compute with com- 
parative ease the aberration coefficients of the various orders. 
Specifically, it was shown in explicit detail (which includes 
numerical examples) how to compute (i) all (effective) monochromatic 
aberration coefficients of orders 3, 5 and 7; (ii) the (monochromatic) 
coefficients of spherical aberration and circular coma of order 
9 and of spherical aberration of order 11; (iii) the paraxial chromatic 
coefficients; (iv) all primary chromatic third-order coefficients; 
(v) the coefficients of secondary chromatic third-order and primary 
chromatic fifth-order spherical aberration. It must be emphasized 
that it is quite feasible to compute, for instance, all the monochro- 
matic ninth-order coefficients: the equations relating to these turn 
out to be still quite manageable, and hold no terrors for a high- 
speed digital computer at all. I further showed how one may 
compute the (exact) first and higher derivatives of the coefficients 
of the various types and orders with respect to the parameters 
which define the constitution of the system, i.e. the axial and extra- 
axial curvatures, the separations, and the refractive indices. The 
availability of these derivatives is of great value in the adjustment 
of aberrations, particularly if this is to be achieved by varying several 
parameters simultaneously and not necessarily by very small 
amounts: whereas the experienced designer will often know what 
to do in the context of the monochromatic and the paraxial chro- 
matic aberrations of systems of spherical surfaces, this is probably 
not the case to the same extent when it comes to aspherical surfaces 
and non-paraxial chromatic aberrations. These observations do not 
necessarily lose their validity in the context of automatic design 
programmes, especially if economy of machine time, as well as the 
need to achieve an absolute, rather than a relative, optimization 
are taken into account. 

The great simplicity of the Lagrangian method of computation 
leads me to believe that it, rather than the Hamiltonian method, 
is best adapted to the problem of practical calculations: though this 
is admittedly a mere expression of opinion. Furthermore, one can 
quite easily use Lagrangian methods to calculate characteristic 
aberration coefficients. To be a little more specific, it is convenient 
to refer to the monograph and the various papers already quoted as 
OAC, OAC I, ..., OAC XIII respectively; and to numbered equations 
of these merely by the additional prefix O, I, II, ..., XIII. The 
coordinates of the Lagrangian theory may be chosen in such a way 
that, $\epsilon'$ having been computed to the required order, a rudimentary 
integration gives the aberration function $t$ to within what is essentially 
the ubiquitous function $g(\zeta)$ we have so often encountered 
before; cf. Sections 49, 50 and 205 of OAC, and also OAC VI. 
Various alternative procedures are possible such as those which 
avoid the appearance of $g(\zeta)$ (see Section 2056 of OAC), or which 
concern themselves directly with the deformation or retardation of 
the wavefront. With regard to the latter OAC VII may be con- 
sulted, but it should be noted that the retardation R considered 
there is not identical with the retardation A of Section 38. In fact, 
in terms of Fig. 5.1, R = Q'G, so that, recalling (38.3), R is 
identical with v, i.e. the aberration function associated with the 
function V(%’, 9’, 2’, y, 2) of equation (38.1). Further, on account 
of (39.3), A—R = O(10). Attention must also be drawn at this 
point to the fact that throughout the Lagrangian developments 
just described the sign conventions differ from those used here. 
To achieve consistency one has to reverse, in the former, the signs 
of £, y and every object or image height, and therefore of every 
(actual) displacement. 

Even though it be granted that Lagrangian methods of computa- 
tion are to be preferred in practice it will not be out of place to 
present here a brief treatment of one possible procedure of calcu- 
lating characteristic functions, that is to say, the coefficients of their 
power series, with exclusive concentration on the monochromatic 
angle characteristic of symmetric systems, save for a brief remark 
relating to the point characteristic at the end of Section 123. To 
do more would surely lead us altogether too far afield and would 
perhaps be unjustified in view of the opinions already expressed. 
At any rate, there will be strong echoes of OAC; and we shall have 
the opportunity to become aware explicitly of some of the sources 
of irrelevant complexity of Hamiltonian methods, such as the fact 
that one is precluded from proceeding step by step from object 
space to image space, or the occurrence of the variables $\beta/\alpha$ where 
one would like $\beta$ to appear; and so on. 


117. The angle characteristic of a surface of revolution 

Considered geometrically, every characteristic function F represents 
an optical distance. Let there be given a symmetric system K 
consisting of homogeneous media whose mutual boundaries are 
surfaces of revolution $\mathcal{S}_j$ ($j = 1, \ldots, k$), and let $B_j$, $B'_j$ ($B'_j = B_{j+1}$; 
$j = 1, \ldots, k-1$) be a suitable set of points upon the axis $\mathcal{A}$ of K. Then 
we may regard the characteristic function F(B, B’) referred to the 
base-points $B$ ($= B_1$) and $B'$ ($= B'_k$) as the sum of the appropriate 
optical distances referred to the base-points $B_j$, $B'_j$: 

$$F(B, B') = \sum_{j=1}^{k} F_j(B_j, B'_j), \tag{117.1}$$


and we speak of $F_j$ as the ‘characteristic function of $\mathcal{S}_j$’. The sum 
on the right of (117.1) is in the first place a function of all the vari- 
ables $q_{ij}$ ($i = 1, \ldots, 4$; $j = 1, \ldots, k$), using the notation introduced at 
the end of Section 6. Eventually, therefore, all the variables other 
than $q_{31}$, $q_{41}$, $q_{1k}$, and $q_{2k}$ will have to be eliminated; and when that 
has been done, F will be the required characteristic function of K. 

Evidently the first task to be accomplished is to obtain an expres- 
sion for $F_j$; but before this can be done a definite choice of the 
various base-points must be made. Amongst the possibilities which 
naturally commend themselves are the points $O_{0j}$, $O'_{0j}$ and the 
points $E_j$, $E'_j$ in which paraxial rays through $O_0$ and through the 
axial point of the plane of the stop respectively intersect $\mathcal{A}$; where, 
of course, the points $O_{01}$, $O'_{0k}$, $E_1$, $E'_k$ are those hitherto denoted by 
$O_0$, $O'_0$, $E$, $E'$ respectively. One will naturally want to take $O_{01}$ as 
the first anterior base-point, but when F is V, the choice of $O'_{01}$ 
as first posterior base-point is precluded. One might take $E'_1$, but 
then the corresponding choice of $O_{02}$ and $E'_2$ as the next pair leaves 
that part of the ray which lies between the normal planes through 
$E'_1$ and $O_{02}$ unaccounted for. In short, the missing segment must be 
restored in some way, and here we already have one complication, 
albeit not a serious one. If F is T we have no such difficulty: we 
can take $B_j$, $B'_j$ to be $O_{0j}$, $O'_{0j}$. Only when $\mathcal{S}_j$ is plane would there 
appear to be a difficulty. This is, however, easily overcome by taking 
$\mathcal{S}_j$ to be spherical, and letting its curvature $c_j$ tend to zero only 
at the end of one’s formal work. There is, furthermore, another, 
much greater, advantage which $T_j$ has over $V_j$: whereas $T_j$ can easily 
be obtained in closed form for various surfaces of practical import- 
ance, for spherical surfaces in particular, one must necessarily be 
content with $V_j$ in the form of a power series. To obtain the latter 
is, moreover, no easy task. 

With the preceding remarks in mind, we go on to consider the 
problem of obtaining the form of $T_j$ for a surface of revolution. Since 
we are only concerned with a fixed value of $j$, this index may be 
omitted in the remainder of this section. We take local axes x, y, z 
which have the usual orientation, their origin being at the axial 
point (the pole) A of $\mathcal{S}$. The coordinates of the point of incidence 
P of the ray $\mathcal{R}$ shall be $x_P$, $y_P$, $z_P$, whilst the equation of $\mathcal{S}$ will be 
written as 

$$x = \tfrac12 s(y^2 + z^2), \tag{117.2}$$

with $s(0) = 0$. 

The following notation is very useful. Let some quantity, what- 
ever it may be, which is associated with $\mathcal{R}$ have the values X, X’ 
respectively before and after refraction of $\mathcal{R}$ by $\mathcal{S}$. Then we write 

$$\Delta X = X' - X, \qquad \nabla X = X' + X. \tag{117.3}$$

(2.1) and (2.6), for instance, now read 

$$\Delta N\mathbf{e} = \sigma\mathbf{p} = \mathbf{p}\,\Delta N\cos I. \tag{117.4}$$


If the base-points have the (x-) coordinates q and q’, an elemen- 
tary geometrical argument shows that 

$$T = \Delta N[(q - x_P)\alpha - (y_P\beta + z_P\gamma)]. \tag{117.5}$$

Next, we have from (117.2) 

$$\mathbf{p} = R^{-1}(1, -\dot s y_P, -\dot s z_P), \tag{117.6}$$

where 

$$\dot s = ds(u)/du, \qquad u = y_P^2 + z_P^2, \tag{117.7}$$

and 

$$R = (1 + u\dot s^2)^{1/2}. \tag{117.8}$$

Upon scalar multiplication of (117.4) throughout by p, using (117.6) 
on the left, one gets 

$$\sigma = \Delta(N\mathbf{e}\cdot\mathbf{p}) = R^{-1}\Delta\{N[\alpha - \dot s(y_P\beta + z_P\gamma)]\};$$

whence 

$$-\Delta[N(y_P\beta + z_P\gamma)] = (R\sigma - \Delta N\alpha)/\dot s. \tag{117.9}$$

On the other hand, the x-component of (117.4) gives the relation 

$$\sigma = R\,\Delta N\alpha, \tag{117.10}$$

and with this (117.9) becomes 

$$-\Delta[N(y_P\beta + z_P\gamma)] = (R^2 - 1)(\Delta N\alpha)/\dot s = u\dot s\,\Delta N\alpha, \tag{117.11}$$

in view of (117.8). Inserting this in (117.5) there comes 

$$T = \Delta(Nq\alpha) + (u\dot s - \tfrac12 s)\,\Delta N\alpha. \tag{117.12}$$

It remains to express the function of u which appears on the right as 
a function of the components of e and e’. 


To begin with, taking the scalar product of each member of 
(117.4) with itself, we have 

$$\sigma^2 = N^2 + N'^2 - 2NN'(\alpha\alpha' + \beta\beta' + \gamma\gamma'),$$

whence 

$$\sigma = (N' - N)(1 + 2\kappa\chi)^{1/2}, \tag{117.13}$$

where 

$$\kappa = NN'/(N' - N)^2, \tag{117.14}$$

and $\chi$ is defined by (65.10). Then (117.10) becomes 

$$u\dot s^2 = [(\Delta N)/(\Delta N\alpha)]^2(1 + 2\kappa\chi) - 1 = \psi, \tag{117.15}$$

say. The left- and right-hand members of this equation are respec- 
tively functions of u alone and of the direction cosines alone, so that 
u may be directly obtained as a function of the direction cosines. 
Upon inserting this function in (117.12) one finally has the explicit 
form of the angle characteristic. 

The most important specific example is that of the spherical surface, 
of radius r, say. Here 

$$s = 2[r - (r^2 - u)^{1/2}], \tag{117.16}$$

whence 

$$\psi = u(r^2 - u)^{-1},$$

and 

$$u\dot s - \tfrac12 s = r[r(r^2 - u)^{-1/2} - 1] = r[(1 + \psi)^{1/2} - 1].$$

Using this result in (117.12), there comes 

$$T = r(\Delta N)(1 + 2\kappa\chi)^{1/2} + \Delta[N(q - r)\alpha], \tag{117.17}$$

and this is, of course, consistent with (65.11). 
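
The closed form (117.17) can be checked numerically against the geometrical expression (117.5). The following Python sketch refracts one ray at a spherical surface by the vector law (117.4) and evaluates T both ways. The function names and sample values are illustrative assumptions and do not come from the text; $\chi$ is taken to be $1 - \mathbf{e}\cdot\mathbf{e}'$, which is what (117.13) implies for the quantity defined by (65.10); and r > 0 is assumed, so that the positive square root in (117.16) applies.

```python
import math

def refract_at_sphere(N, N1, r, yP, zP, e):
    """Refract the unit direction e at the point (xP, yP, zP) of a sphere, r > 0."""
    u = yP * yP + zP * zP
    root = math.sqrt(r * r - u)
    xP = r - root                                  # x = s/2, with s from (117.16)
    p = (root / r, -yP / r, -zP / r)               # unit normal, cf. (117.6)
    cos_i = sum(a * b for a, b in zip(e, p))
    cos_i1 = math.sqrt(1.0 - (N / N1) ** 2 * (1.0 - cos_i ** 2))
    sigma = N1 * cos_i1 - N * cos_i                # sigma of (117.4)
    e1 = tuple((N * a + sigma * b) / N1 for a, b in zip(e, p))
    return xP, e1

def T_direct(N, N1, q, q1, xP, yP, zP, e, e1):
    """Angle characteristic from the geometrical expression (117.5)."""
    def bracket(n, base, d):
        return n * ((base - xP) * d[0] - (yP * d[1] + zP * d[2]))
    return bracket(N1, q1, e1) - bracket(N, q, e)

def T_closed(N, N1, r, q, q1, e, e1):
    """Angle characteristic from the closed form (117.17)."""
    kappa = N * N1 / (N1 - N) ** 2                 # (117.14)
    chi = 1.0 - sum(a * b for a, b in zip(e, e1))  # assumed meaning of chi
    return (r * (N1 - N) * math.sqrt(1.0 + 2.0 * kappa * chi)
            + N1 * (q1 - r) * e1[0] - N * (q - r) * e[0])

# Hypothetical sample data: air-to-glass surface, skew ray.
N, N1, r, q, q1 = 1.0, 1.5, 50.0, -200.0, 300.0
yP, zP = 3.0, -2.0
raw = (1.0, 0.05, -0.03)
norm = math.sqrt(sum(a * a for a in raw))
e = tuple(a / norm for a in raw)

xP, e1 = refract_at_sphere(N, N1, r, yP, zP, e)
difference = T_direct(N, N1, q, q1, xP, yP, zP, e, e1) - T_closed(N, N1, r, q, q1, e, e1)
print(difference)
```

The two evaluations should agree to rounding error for any admissible ray, since (117.17) follows from (117.5) without approximation.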
Another very simple example is provided by the paraboloid 

$$s = u/2a, \tag{117.18}$$

where a is a constant. Then 

$$u = 4a^2\psi, \qquad u\dot s - \tfrac12 s = u/4a = a\psi,$$

whence 

$$T = \Delta[N(q - a)\alpha] + a(\Delta N)^2(1 + 2\kappa\chi)/(\Delta N\alpha). \tag{117.19}$$


Finally, consider the conicoid, i.e. a surface of revolution whose 
meridional section is a conic section. We take its equation in the 
form 

$$(a - x)^2 + bu = a^2, \tag{117.20}$$

so that 

$$s = 2[a - (a^2 - bu)^{1/2}]. \tag{117.21}$$

It then follows without difficulty that 

$$u\dot s - \tfrac12 s = a[(1 + \psi/b)^{1/2} - 1], \tag{117.22}$$

which is to be inserted into (117.12). 
Quite generally one may write s as a power series: 

$$s(u) = s_1u + s_2u^2 + \ldots. \tag{117.23}$$

The resulting series on the left of (117.15) may be inverted itera- 
tively, and the result substituted in the power series for $u\dot s - \tfrac12 s$. 
One thus obtains 

$$u\dot s - \tfrac12 s = \tfrac12[s_1^{-1}\psi - s_1^{-4}s_2\psi^2 - (s_1^{-6}s_3 - 4s_1^{-7}s_2^2)\psi^3] + O(8); \tag{117.24}$$

and this is all one requires as long as one does not intend to proceed 
beyond the fifth order. 
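
The series (117.24) can be compared with the closed forms obtained above for the sphere and the paraboloid. In the Python sketch below the function names and sample values are illustrative assumptions; the coefficients used for the sphere, $s_1 = c$, $s_2 = c^3/4$, $s_3 = c^5/8$ with $c = 1/r$, are obtained by expanding (117.16) in powers of u.

```python
import math

def series_117_24(s1, s2, s3, psi):
    """Truncated series (117.24) for u*s_dot - s/2 as a function of psi."""
    return 0.5 * (psi / s1
                  - s2 * psi ** 2 / s1 ** 4
                  - (s3 / s1 ** 6 - 4.0 * s2 ** 2 / s1 ** 7) * psi ** 3)

def sphere_exact(r, psi):
    """Closed form r[(1 + psi)^(1/2) - 1] preceding (117.17)."""
    return r * (math.sqrt(1.0 + psi) - 1.0)

def paraboloid_exact(a, psi):
    """Closed form a*psi preceding (117.19)."""
    return a * psi

# Hypothetical sample values.
r, a, psi = 2.0, 3.0, 0.01
c = 1.0 / r
sphere_error = series_117_24(c, c ** 3 / 4.0, c ** 5 / 8.0, psi) - sphere_exact(r, psi)
paraboloid_error = series_117_24(1.0 / (2.0 * a), 0.0, 0.0, psi) - paraboloid_exact(a, psi)
print(sphere_error, paraboloid_error)
```

For the paraboloid the series terminates after its first term and so reproduces $a\psi$ exactly; for the sphere the discrepancy is of the order of the first neglected term, i.e. of the fourth degree in $\psi$.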


118. Paraxial precalculation 


Paraxial rays may be traced through K by an elementary applica- 
tion of the laws of refraction (2.1). At the jth surface we take local 
axes $x_j$, $y_j$, $z_j$ as before, origin at $A_j$, and it will do no harm here 
to denote the coordinates of the point of incidence $P_j$ of the ray by 
these symbols, i.e. to omit the subscript P. To the first order, (2.1) 
reduces to 

$$\Delta N_j\mathbf{e}_j = \mathbf{p}_j\,\Delta N_j, \tag{118.1}$$

which gives $\mathbf{e}'_j$ in terms of $\mathbf{e}_j$, granted that $\mathbf{p}_j$ is known: 

$$\mathbf{p}_j = (1, -c_jy_j, -c_jz_j), \tag{118.2}$$

where $c_j$ ($= 1/r_j$) is the curvature of $\mathcal{S}_j$. Henceforth (except at the 
end of Section 120) we consider only spherical surfaces, but even 
when $\mathcal{S}_j$ is not spherical, (118.2) remains valid if $c_j$ be understood 
to be the curvature of $\mathcal{S}_j$ at $A_j$. To proceed to the next surface we 
only require the relations 

$$\beta_{j+1} = \beta'_j, \qquad y_{j+1} = y_j + d'_j\beta'_j, \tag{118.3}$$

where $d_j$ is the axial separation between $\mathcal{S}_{j-1}$ and $\mathcal{S}_j$. (The symbols 
$d_1$ and $d'_k$ will evidently not occur, so that confusion with the 
symbols d and d’ in equations (14.27-30) is not likely.) 
Recall the variables $\sigma$ and $\mu$ defined, for arbitrary rays, by (24.2): 

$$\sigma = \beta_1 - s\beta'_k, \qquad \mu = \beta_1 - m\beta'_k. \tag{118.4}$$

s and m are, as always, the reduced magnifications associated with 
the pupil planes, and the object and image planes, respectively. The 
values of these constants are supposed known. If meridional 
paraxial rays through $O_0$ and E be traced through K, then $\beta_1/\beta'_k$ 
has the value m in the first case, and s in the second. 

Now, on account of the linearity of the relations (118.1) and 
(118.3), there must exist paraxial constants $v_{aj}$, $v_{bj}$, $v'_{aj}$, $v'_{bj}$ such that 
paraxially 

$$\beta_j = v_{aj}\sigma + v_{bj}\mu, \qquad \beta'_j = v'_{aj}\sigma + v'_{bj}\mu. \tag{118.5}$$

These constants are easily calculated as follows. Assigning to $j$ 
alternatively the values 1 and $k$, one obtains two identities involving 
only $\beta_1$ and $\beta'_k$. Inspection shows that 

$$v_{a1} = -m/(s - m), \quad v'_{ak} = -1/(s - m), \quad v_{b1} = s/(s - m), \quad v'_{bk} = 1/(s - m). \tag{118.6}$$

The ratios $v_{a1}/v'_{ak}$ and $v_{b1}/v'_{bk}$ have just the values which $\beta_1/\beta'_k$ 
takes for rays through $O_0$ on the one hand and through E on the 
other. Therefore, if these points are located at $x_1 = l_1$ and $x_1 = p_1$ 
respectively, trace two meridional (paraxial) rays, the starting data 
of which are 

$$\text{(i) a-ray:}\quad y_{a1} = ml_1/(s - m), \quad v_{a1} = -m/(s - m), \tag{118.7}$$

$$\text{(ii) b-ray:}\quad y_{b1} = -sp_1/(s - m), \quad v_{b1} = s/(s - m). \tag{118.8}$$

The value of $\beta_j$ in the trace of the a-ray is then just $v_{aj}$, the value of 
$\beta'_j$ is $v'_{aj}$, whilst the trace of the b-ray similarly allows one simply to 
read off the values of $v_{bj}$ and $v'_{bj}$. Again, the values of $y_j$ given by the 
a-ray and b-ray define paraxial constants $y_{aj}$ and $y_{bj}$, such that for 
any paraxial ray 

$$y_j = y_{aj}\sigma + y_{bj}\mu. \tag{118.9}$$

The notation quite naturally extends itself to any other variables 
which are linearly related to y and $\beta$. One such case is represented 
by the quantity 

$$i = cy + \beta, \tag{118.10}$$

which is of frequent occurrence. If the a-trace and b-trace are so 
set out that i, regarded as an auxiliary variable, appears explicitly in 
them, then the constants $i_{aj}$, $i_{bj}$ may simply be read off, and then 

$$i_j = i_{aj}\sigma + i_{bj}\mu. \tag{118.11}$$

We remark in passing that upon using (118.10) in (118.1) the latter 
becomes 

$$\Delta Ni = 0, \tag{118.12}$$

which is a paraxial expression of the invariance of the vector $N\mathbf{e} \times \mathbf{p}$ 
under refraction. 
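
A paraxial trace organised around the auxiliary variable i can be sketched as follows. This is a minimal Python illustration, not the author's computing scheme: the function name, the tuple layout of a surface and the sample data are assumptions, and the transfer step uses the ordinary paraxial rule that the height changes by the separation times the slope. Refraction is carried out through (118.10) and (118.12), that is, $N'i' = Ni$ with y unchanged.

```python
def paraxial_trace(y, beta, surfaces):
    """Trace a meridional paraxial ray through a list of surfaces.

    Each surface is a tuple (c, N, N1, d_next): curvature, index before,
    index after, and axial separation to the following surface.
    Returns one row (y, beta, i, beta1) per surface.
    """
    rows = []
    for c, N, N1, d_next in surfaces:
        i = c * y + beta               # (118.10)
        i1 = N * i / N1                # (118.12): N'i' = Ni
        beta1 = i1 - c * y             # y is unchanged by refraction
        rows.append((y, beta, i, beta1))
        y, beta = y + d_next * beta1, beta1   # transfer to the next surface
    return rows

# Hypothetical single surface: r = 50, air to glass of index 1.5.
single = paraxial_trace(1.0, 0.0, [(0.02, 1.0, 1.5, 0.0)])
back_focus = -single[-1][0] / single[-1][3]

# Hypothetical thin lens in air: two surfaces in contact.
lens = paraxial_trace(1.0, 0.0, [(0.02, 1.0, 1.5, 0.0), (-0.02, 1.5, 1.0, 0.0)])
focal_length = -lens[-1][0] / lens[-1][3]
print(back_focus, focal_length)
```

Running the a-ray and the b-ray of (118.7-8) through such a routine yields the constants $v_{aj}$, $y_{aj}$, $i_{aj}$ and their b-counterparts as the successive rows of the two traces.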

It may be useful to comment upon an apparent inconsistency in 
the notation introduced above, represented by the use of the symbols 
$v_{aj}$, ... rather than $\beta_{aj}$, ... on the right of (118.5). The point is that 
the right-hand member of the first of these, for instance, obviously 
can be thought of as giving at the same time the ‘paraxial’ value of 
any variable which reduces to $\beta_j$ in the paraxial limit. We shall 
indeed have occasion later to introduce variables $V$ ($= V_y, V_z$) 
defined as 

$$V = \beta/\alpha, \tag{118.13}$$

and then we shall have 

$$V_j = v_{aj}\sigma + v_{bj}\mu \tag{118.14}$$

in the paraxial limit. If we define four variables $S$ ($= S_y, S_z$) and 
$M$ ($= M_y, M_z$) through 

$$S = V_1 - sV'_k, \qquad M = V_1 - mV'_k, \tag{118.15}$$

we shall still have 

$$V_j = v_{aj}S + v_{bj}M \tag{118.16}$$

in the paraxial limit, since S then coincides with $\sigma$, and M with $\mu$. 
Under some circumstances it is desirable to relate the paraxial 
b-constants to the corresponding a-constants in such a way that the 
b-ray enters into the various relations solely through the ratio 

$$\theta = i_b/i_a. \tag{118.17}$$

To this end recall that, omitting the surface index, 

$$N(y_av_b - y_bv_a) = N_1(y_{a1}v_{b1} - y_{b1}v_{a1}), \tag{118.18}$$

with a similar relation in which on the left all quantities are primed. 
Because of (118.7-8) the right-hand member becomes 

$$-N_1sm(s - m)^{-2}(p_1 - l_1) = -d\,sm(s - m)^{-2} = (s - m)^{-1}f_K, \tag{118.19}$$

where (14.28) has been used, $f_K$ denoting the focal length of K as a 
whole. With (118.10) the left-hand member of (118.18) may be 
written 

$$Nr(i_av_b - i_bv_a) = Nr\,i_a(v_b - \theta v_a).$$

If we put 

$$\phi = f_K/[(s - m)Nr\,i_a], \tag{118.20}$$

we finally arrive at the relation 

$$v_b = \theta v_a + \phi. \tag{118.21}$$

(118.21) yields a valid result if one simply attaches primes to both 
$v_b$ and $v_a$, since $\Delta\theta = 0$ and $\Delta\phi = 0$. One also has 

$$y_b = \theta y_a - r\phi. \tag{118.22}$$

Once all the b-constants have been formally eliminated in this way 
one can simplify the notation further by omitting the subscript a 
from the a-constants, without risk of confusion. 
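
The relations (118.21) and (118.22) rest only on the invariance expressed by (118.18), so they can be verified for any pair of paraxial rays. The Python sketch below is an illustration under stated assumptions: the function name, the surface tuples and the starting data are hypothetical, and the constant called `lagrange` simply stands for the right-hand member of (118.18), which in the text has the value $(s - m)^{-1}f_K$.

```python
def trace_pair(surfaces, ray_a, ray_b):
    """Trace two paraxial rays; return residuals of (118.21) and (118.22).

    surfaces: list of (c, N, N1, d_next); rays: (y, v) at the first surface.
    """
    (ya, va), (yb, vb) = ray_a, ray_b
    lagrange = surfaces[0][1] * (ya * vb - yb * va)   # right-hand member of (118.18)
    residuals = []
    for c, N, N1, d_next in surfaces:
        ia, ib = c * ya + va, c * yb + vb             # (118.10)
        theta = ib / ia                               # (118.17)
        phi = lagrange * c / (N * ia)                 # plays the role of (118.20)
        va1 = N * ia / N1 - c * ya                    # refraction via (118.12)
        vb1 = N * ib / N1 - c * yb
        residuals.append((
            vb - (theta * va + phi),                  # (118.21)
            vb1 - (theta * va1 + phi),                # (118.21) with primes attached
            yb - (theta * ya - phi / c),              # (118.22), with r = 1/c
        ))
        ya, va = ya + d_next * va1, va1               # transfer to the next surface
        yb, vb = yb + d_next * vb1, vb1
    return residuals

# Hypothetical two-surface system and starting data.
system = [(0.02, 1.0, 1.5, 5.0), (-0.015, 1.5, 1.0, 0.0)]
residuals = trace_pair(system, (1.0, 0.0), (0.0, 0.1))
print(residuals)
```

All residuals should vanish to rounding error at every surface, which is the numerical counterpart of the statement that $\theta$ and $\phi$ are unchanged by refraction.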


119. The quasi-invariants $\Lambda$ and $\Lambda^*$ 


Let T be the angle characteristic of the jth surface $\mathcal{S}$. The index 
j is again omitted for the time being, since only a particular, though 
arbitrarily selected, surface is in question. (This convention will 
generally be adhered to, except where special emphasis requires 
the restoration of this index.) The base-points B and B’ shall be 
mutually conjugate, but otherwise unrestricted. Further, let v and 
v’ be the paraxial constants (exactly analogous to $v_a$, $v'_a$, or to $v_b$, 
$v'_b$) defined by a paraxial ray which passes through B, and therefore 
also through B’. Then, contemplate the quantity $\Lambda$ defined by 

$$\Lambda = -NvH, \tag{119.1}$$

where $H_y$, $H_z$ are the coordinates of the point of intersection of a 
(finite) ray $\mathcal{R}$ with the anterior base-plane $\mathcal{B}$. The change in $\Lambda$ 
consequent upon refraction of $\mathcal{R}$ by $\mathcal{S}$ will be written as 

$$g = \Delta\Lambda = -\Delta(NvH). \tag{119.2}$$


Observe now that in the limit when $\mathcal{R}$ itself becomes paraxial, 
$\Lambda$ reduces to a joint paraxial invariant of the kind defined by 
(10.12). Under these circumstances g must vanish, according to 
(10.13). We therefore call $\Lambda$ a quasi-invariant, this nomenclature 
being intended to apply to any quantity, associated with a ray, 
whose change upon refraction vanishes in the paraxial limit. In 
short, $\Lambda$ has the valuable property that 

$$\Delta\Lambda = O(3), \tag{119.3}$$

i.e. the dominant terms of g are not paraxial but of order three. 
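
The property (119.3) can be illustrated numerically. The Python sketch below traces a finite meridional ray exactly through one spherical surface, forms $\Lambda = -NvH$ on the two conjugate base-planes, and returns $g = \Delta\Lambda$ for a ray whose height and slope are both scaled by a common factor. Everything specific here is an assumption made for illustration: the function name, the sample indices and distances, the normalisation v = 1 of the paraxial ray, and the use of the ordinary paraxial imaging equation to locate the conjugate base-point.

```python
import math

def quasi_invariant_change(scale, N=1.0, N1=1.5, r=50.0, q=-200.0):
    """Return g = Lambda' - Lambda for a meridional ray scaled by `scale`."""
    q1 = N1 / (N / q + (N1 - N) / r)        # conjugate base-point (paraxial imaging)
    v = 1.0                                 # paraxial ray through B, arbitrary scale
    y_par = -q * v                          # its height at the pole, from qv = -y
    v1 = -y_par / q1                        # its slope after refraction
    H, beta = 1.0 * scale, 0.01 * scale     # finite ray on the anterior base-plane
    alpha = math.sqrt(1.0 - beta * beta)
    # Intersection with the sphere (x - r)^2 + y^2 = r^2, branch near the pole.
    b = alpha * (q - r) + beta * H
    c0 = (q - r) ** 2 + H * H - r * r
    t = -b - math.sqrt(b * b - c0)
    x, y = q + t * alpha, H + t * beta
    p = ((r - x) / r, -y / r)               # unit normal pointing along +x
    cos_i = alpha * p[0] + beta * p[1]
    cos_i1 = math.sqrt(1.0 - (N / N1) ** 2 * (1.0 - cos_i ** 2))
    sigma = N1 * cos_i1 - N * cos_i         # sigma of (117.4)
    alpha1 = (N * alpha + sigma * p[0]) / N1
    beta1 = (N * beta + sigma * p[1]) / N1
    H1 = y + (q1 - x) * beta1 / alpha1      # height on the posterior base-plane
    return -N1 * v1 * H1 + N * v * H        # (119.2)

g_small = quasi_invariant_change(0.1)
g_large = quasi_invariant_change(0.2)
print(g_small, g_large, g_large / g_small)
```

Doubling the scale of the ray should multiply g by very nearly eight, the residual departure from eight being due to terms of order five and higher.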
In view of (6.2) we have quite generally 

$$g = \nabla\left(v\,\frac{\partial T}{\partial\beta}\right). \tag{119.4}$$

The expression on the right may be evaluated in general terms, by 
using the explicit expression for T which, taking now the case of a 
spherical surface, is given by (117.17). Differentiating it, one finds 
with little labour that 

$$g = \kappa r(\Delta N)(1 + 2\kappa\chi)^{-1/2}[v(\alpha'\beta/\alpha - \beta') + v'(\alpha\beta'/\alpha' - \beta)] + [N(q - r)v\beta/\alpha - N'(q' - r)v'\beta'/\alpha']. \tag{119.5}$$
Now, since the paraxial ray through which v is defined passes 
through the point x = q, we have qv = −y. In view of (118.10) 
we therefore have 

$$N(q - r)v = -Nri, \tag{119.6}$$

and in exactly the same way, 

$$N'(q' - r)v' = -N'ri' = -Nri, \tag{119.7}$$

where (118.12) has been used on the right. Also 

$$\kappa r\,\Delta N = \frac{rNN'}{N' - N} = f, \tag{119.8}$$

where f stands for the mean focal length (in the sense of equation 
(14.13)) of $\mathcal{S}$, regarded as an optical system in its own right. If 
we now go over at the same time to the variables defined by (118.13), 
(119.5) becomes 

$$g = [f(1 + 2\kappa\chi)^{-1/2}(v'\alpha - v\alpha') + Nri]\,\Delta V. \tag{119.9}$$

With regard to the first-order term of g, this is $(f\Delta v + Nri)\,\Delta V$, 
since, as regards dominant terms, we need not distinguish between 
$\Delta V$ and $\Delta\beta$. However, $f\Delta v = f\Delta i = -Nri$, on account of (118.12) 
and (119.8), and so the term in question vanishes, confirming what 
we already know. 

It will be seen that g factorizes, a result which appears in an 
equivalent form in Section 14 of OAC. This factorization leads to 
considerable simplifications in the computation of higher-order 
coefficients (see OAC, Section 84), but unfortunately no such de- 
composition of g is possible when the surface is aspherical (cf. 
O (60.3), and equations (120.10), (120.22) below). 

It is of some importance to investigate also the properties of the 
quasi-invariant 

$$\Lambda^* = \alpha\Lambda. \tag{119.10}$$

The equations corresponding to (119.2) and (119.4) are 

$$g^* = \Delta\Lambda^* = -\Delta(NvH^*) = \nabla\left(v\alpha\,\frac{\partial T}{\partial\beta}\right), \tag{119.11}$$

where $H^* = \alpha H$. Proceeding as before it turns out that 

$$g^* = e[-(1 + 2\kappa\chi)^{-1/2}(\alpha\beta' - \alpha'\beta) + (\beta' - \beta)], \tag{119.12}$$

where 

$$e = Nri. \tag{119.13}$$

Here $\beta$ and $\beta'$ now appear as the ‘natural’ variables. g* moreover 
has again decomposed into two factors, of which one relates only 
to the paraxial ray, whereas the second does not contain it at all; cf. 
O (50.21). Accordingly we write 

$$g^* = ew. \tag{119.14}$$

However, as before, this decomposition is peculiar to spherical 
surfaces alone. 


120. The third-order aberrations 


Before we go on to questions relating to higher-order aberrations 
it may be as well to derive expressions for the five third-order 
coefficients from which their numerical values may be computed; 
bearing in mind their outstanding importance in practice. Two 
possibilities arise, namely, we focus our attention either on the 
third-order aberration function $t^{(2)}$ as such, or else directly on the 
third-order displacement $\epsilon'_3$. In view of the trivial nature of the 
relations (23.2) between the effective and the characteristic coeffi- 
cients it is largely a matter of indifference which approach we 
choose. We shall, indeed, adopt the second of them for reasons to 
be explained in a moment; and it will be more convenient to post- 
pone certain remarks concerning the first method until such time 
as we have arrived at the desired results. 

From (119.2) we have at once 

$$\Lambda'_k = \Lambda_1 + \sum_{j=1}^{k} g_j, \tag{120.1}$$

since $\Lambda'_j = \Lambda_{j+1}$; granted, of course, that the same paraxial ray 
serves to define $q_j$, $q'_j$, $v_j$, $v'_j$ for all values of $j$. Let this ray now be 
taken specifically to be the a-ray of (118.7), the corresponding angle 
characteristic being $T_a$. Then (120.1) becomes 

$$H'_k = \frac{N_1v_{a1}}{N'_kv'_{ak}}H_1 - \frac{1}{N'_kv'_{ak}}\sum_{j=1}^{k} g_j. \tag{120.2}$$

Now let $H'_k$ and $H_1$ stand for the reduced coordinates of O’ and O 
respectively, all other lengths remaining unreduced. Then 

$$\epsilon' = H'_k - mH_1$$

is exactly the displacement hitherto denoted by this symbol, and, 
bearing in mind the second member of (118.6), (120.2) at once 
reduces to 

$$\epsilon' = (s - m)\sum_{j=1}^{k} g_j, \tag{120.3}$$

an equation which, of course, involves no approximations. 

On the right we must now use (119.9) for the individual con- 
tributions $g_j$. The latter are still functions of the intermediate vari- 
ables $V_j$, $V'_j$, and these must be eliminated in favour of S and M. 
As far as the third-order displacement is concerned, we evidently 
only require the dominant terms of (119.9) for we already know these 
to be of the third order. Consequently, if in them the intermediate 
variables be expressed in terms of the paraxial relations, the error 
so committed is at least of the fifth order, and is therefore irrelevant 
in the present context. The task before us is therefore a very simple 
one. 

Since $\chi = O(2)$, we have from (119.9) 

$$g = f[-\kappa\chi\,\Delta v_a + \tfrac12(v_a\bar\zeta - v'_a\bar\xi)]\,\Delta V + O(5), \tag{120.4}$$

where, of course, $\bar\xi$, $\bar\eta$, $\bar\zeta$ refer to $\mathcal{R}$, e.g. $\bar\xi = \beta^2 + \gamma^2$. Using (65.12) 
this becomes 

$$g = \tfrac12 f[-\kappa(k - 1)i_a(\bar\xi - 2\bar\eta + \bar\zeta) + (v_a\bar\zeta - v'_a\bar\xi)]\,\Delta\beta + O(5), \tag{120.5}$$

$\Delta V$ having been replaced by $\Delta\beta$, which is permissible because the 
difference between these expressions is O(3). The constant k is 
defined as 

$$k = N/N'. \tag{120.6}$$

Let us take the various terms in turn. We have 

$$\Delta\beta = (\Delta v_a)\sigma + (\Delta v_b)\mu = (\Delta i_a)\sigma + (\Delta i_b)\mu = (k - 1)i(\sigma + \theta\mu), \tag{120.7}$$

the convention introduced at the end of Section 118 being under- 
stood. In just the same way 

$$\bar\xi - 2\bar\eta + \bar\zeta = (\Delta\beta)^2 + (\Delta\gamma)^2 = (k - 1)^2i^2(\xi + 2\theta\eta + \theta^2\zeta). \tag{120.8}$$

Again, 

$$v_a\bar\zeta - v'_a\bar\xi = v_av'_a(\Delta v_a)\xi + 2v_av'_a(\Delta v_b)\eta + (v_av_b'^2 - v'_av_b^2)\zeta.$$

Upon using (118.21), this becomes quite easily 

$$v_a\bar\zeta - v'_a\bar\xi = (k - 1)i[vv'(\xi + 2\theta\eta + \theta^2\zeta) - \phi^2\zeta]. \tag{120.9}$$

(120.7-9) are now to be inserted into (120.5). When we remember 
that $\kappa = k(k - 1)^{-2}$ and $f = -Nr(k - 1)^{-1}$, we thus find that 

$$g = \tfrac12 Nr(k - 1)i^2[(ii' - vv')(\xi + 2\theta\eta + \theta^2\zeta) + \phi^2\zeta](\sigma + \theta\mu). \tag{120.10}$$

Next we note that 

$$r(ii' - vv') = y(i' + v), \tag{120.11}$$

whilst 

$$N(k - 1)ri^2\phi^2 = (s - m)^{-2}f_K^2\,\varpi, \tag{120.12}$$

where 

$$\varpi = c\,\Delta(1/N). \tag{120.13}$$

At this point we set $f_K = 1$ as we usually did throughout our earlier 
work. It is quite natural to introduce the abbreviation 

$$p = \tfrac12 N(k - 1)i^2y(i' + v), \tag{120.14}$$

and then (120.10) reads 

$$g = [p(\xi + 2\theta\eta + \theta^2\zeta) + \tfrac12(s - m)^{-2}\varpi\zeta](\sigma + \theta\mu). \tag{120.15}$$

Recalling the meaning of the effective coefficients, together with 
the relations 6 = y,+O(3), % = y,+ O(3) which appear just after 
equations (24.8), we have finally 


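$$\begin{aligned}
p_1^* &= (s - m)\sum p, \\
p_2^* &= 2(s - m)\sum\theta p, \\
p_3^* &= (s - m)\sum[\theta^2p + \tfrac12(s - m)^{-2}\varpi], \\
p_4^* &= (s - m)\sum\theta p, \\
p_5^* &= 2(s - m)\sum\theta^2p, \\
p_6^* &= (s - m)\sum\theta[\theta^2p + \tfrac12(s - m)^{-2}\varpi].
\end{aligned} \tag{120.16}$$

The summations of course go over all the surfaces of the system. 
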
We may also remind ourselves that the Seidel coefficients 6, ..., 05 
in (19.7) are given by 
= p* = 1p* —_ 1p* = pe — p¥ 
O,=fpt, C2.=3P3, %3=3P2, C1 =PF—3P2, %s = ps. 
(120.17) 


To the present order the complications which arise from the 
asphericity of refracting surfaces are minor, and we proceed to 
investigate them. Since the equation of a spherical surface is 
$s = cu + \tfrac14c^3u^2 + O(6)$, we write that of an aspherical surface as 

$$s = cu + \tfrac14c^3(1 + a)u^2 + O(6), \tag{120.18}$$

where the constant a is a measure of the asphericity near the pole, 
whilst c is now the axial curvature of $\mathcal{S}$, i.e. the curvature at its 
pole. In (117.23) we now have 

$$s_1 = c, \qquad s_2 = \tfrac14c^3(1 + a), \tag{120.19}$$

and (117.24) may then be written conveniently as 

$$u\dot s - \tfrac12 s = (u\dot s - \tfrac12 s)_0 - \tfrac18ra\psi^2 + O(6), \tag{120.20}$$

the first term on the right representing that part of the complete 
expression which remains when a = 0. Now $\psi$ is defined by (117.15), 
and upon expanding it in powers of $\bar\xi$, $\bar\eta$, $\bar\zeta$ it turns out that 

$$\psi = (\Delta N)^{-2}[(\Delta N\beta)^2 + (\Delta N\gamma)^2] + O(4). \tag{120.21}$$


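If we denote by D(g) the terms which have to be added to the right- 
hand member of (120.5), we have, in view of (117.12) and (119.4) 

$$D(g) = -\tfrac14(\Delta N)\,ar\,\psi\,\nabla\left(v\,\frac{\partial\psi}{\partial\beta}\right) + O(5) = -\tfrac12ra(\Delta N)^{-3}(\Delta Nv)(\Delta N\beta)[(\Delta N\beta)^2 + (\Delta N\gamma)^2] + O(5).$$

However, $\Delta Nv = -cy\,\Delta N$, $\Delta N\beta = -cy_P\,\Delta N + O(3)$, 
and therefore 

$$D(g) = -\tfrac12c^3a(\Delta N)\,yy_P(y_P^2 + z_P^2) + O(5). \tag{120.22}$$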


On account of the particular form of this result it is pointless to use 
(118.22) here. It is better to set 


G = y/¥as (120.23) 
introducing the abbreviation 


c=—}ayAN (120.24) 
at the same time. Then 


D(g) = (§ + 284 + 6£) (6+ Gp) + O(5), (120.25) 


and then the primary effective coefficients are given by (120.16) if 
one merely makes the formal replacements 


Op > Op+6%c (s = 0,1, 2,3) (120.26) 
throughout. 


121. The third-order aberration function 


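We write 

$$t^{(2)} = p_1\xi^2 + p_2\xi\eta + p_3\xi\zeta + p_4\eta^2 + p_5\eta\zeta + p_6\zeta^2, \tag{121.1}$$

as was done after equation (44.7), i.e. the factor $(s - m)^{-1}$ on the 
right of (19.24) is omitted. Then, referring to (23.2), we know that 

$$\begin{aligned}
p_1^* &= 4(s - m)p_1, & p_2^* &= 2(s - m)p_2, & p_3^* &= 2(s - m)p_3, \\
p_4^* &= (s - m)p_2, & p_5^* &= 2(s - m)p_4, & p_6^* &= (s - m)p_5.
\end{aligned} \tag{121.2}$$

By inspection of (120.16) we see that the integrability condition 
(23.8), which to this order requires $2p_4^* - p_2^* = 0$, is satisfied, as 
it must be. Comparison of (121.2) with (120.16) then shows that 

$$\begin{aligned}
p_1 &= \tfrac14\sum p, \qquad p_2 = \sum\theta p, \qquad p_3 = \tfrac12\sum[\theta^2p + \tfrac12(s - m)^{-2}\varpi], \\
p_4 &= \sum\theta^2p, \qquad p_5 = \sum\theta[\theta^2p + \tfrac12(s - m)^{-2}\varpi].
\end{aligned} \tag{121.3}$$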


These, then, are the expressions required for the calculation of the 
coefficients of $t^{(2)}$, $p_6$ excepted. Their structure is evidently exceed- 
ingly simple; and this is still true when the surface is aspherical, 
a situation which is allowed for by the formal replacements (120.26). 
In Section 125 we shall return to the ‘missing’ coefficient $p_6$. 

The route we have pursued to arrive at (121.3) seems somewhat 
roundabout at first sight. In principle we start with the angle charac- 
teristic, use its derivatives to compute the displacement, and then 
return to the angle characteristic by integration (for that is what ‘the 
comparison of (121.2) with (120.16)’ amounts to). We therefore 
naturally ask ourselves whether, to the required order, we might not 
have eliminated the redundant variables directly from $t_j$. Here an 
apparent complication arises. Thus, write 

$$T_j = T_j^{(0)} + T_j^{(1)} + t_j, \tag{121.4}$$

where $T_j^{(0)}$ is a constant, and $T_j^{(1)}$ stands for the terms of $T_j$ linear in 
$\bar\xi$, $\bar\eta$, $\bar\zeta$. Then, recalling (44.1), we have 

$$t = \sum_{j=1}^{k}(T_j^{(1)} + t_j) - \zeta/2m. \tag{121.5}$$

It is from the sum which appears here that the redundant variables 
have to be removed; but now, upon using the appropriate relations 

$$\beta_j = v_{aj}\sigma + v_{bj}\mu + O(3), \tag{121.6}$$

it seems at first sight that the terms of order three in them will 
cause $T_j^{(1)}$ to contribute third- (and higher-) order terms to $t$. In 
other words, it appears that one cannot simply take $t^{(2)} = \sum t_j^{(2)}$ and 
use the paraxial relations to eliminate the intermediate variables. 
Happily it turns out that, as far as $t^{(2)}$ (but not $t^{(3)}$, ...) is concerned, 
one can do what has just been described after all. That this is so will 
be shown in Section 124, for by that time we shall have established 
certain incidental results which will enable us to elaborate upon 
the point in question in the detail which its importance demands. 


122. The absolute invariant «, 


In Chapter 5D we studied at length the so-called invariant and 
semi-invariant aberrations of symmetric systems. We found that 
there was only one such third-order invariant, namely «3, and it was 
absolute, i.e. independent of both s and m. According to (50.3) 


Ay = 2(s—m)* (2p3—Py). (122.1) 
However, from (121.3) we know that 
Ps = 3207p +Oc+Hs—m)?G], py = X(*p+ Pe), 
from which it follows that a3, = )6. (122.2) 


The sum on the right is called the Petzval Sum. Since it depends 
solely on the axial curvatures of the surfaces and the refractive 
indices of the various media, it is obviously independent of s and m, 
as it must be. 


123. Intermediate variables as functions of the coordinates 


If one wishes to go beyond the third order the paraxial relations 
between the intermediate variables and the coordinates are no 
longer adequate. The question therefore arises as to the exact rela- 
tions which subsist between them, taking, for the time being, the 
$V_j$ and $V'_j$ as intermediate variables, and $S$ and $M$ as ‘coordinates’.
To answer it, we begin by simultaneously contemplating two
angle characteristics $T_a$ and $T_b$ which differ from each other with
respect to the choice of base-points. $T_a$ is the function considered
hitherto, i.e. $T_a$ has base-points $O_0$, $O'_0$, and the individual parts
$T_{aj}$ of course relate to the base-points $O_{0j}$, $O'_{0j}$; so that, for instance,
$q_j$ in (117.17) is $-y_{aj}/v_{aj}$ as before. $T_b$ on the other hand refers to the
base-points $E$, $E'$, and $T_{bj}$ to $E_j$, $E'_j$; and in this case the constant
$q_j$ is to be taken as $-y_{bj}/v_{bj}$. The two functions are simply related


to each other: 
Tog — Tag = ALN (G0 — Ga) 1). (123.1) 
Using (118.18-19) this becomes 
Ty = Tj => (s = m)* fre A(a/v, Up)j- (123.2 ) 


If one sums this over the whole system, and takes the definitions 
of s and m into account, one gets 


T,— T, = —(s—m) f(%j— %/sm) = —d’a;,—da,, (123.3) 

a result which can be seen to hold from first principles, bearing in 

mind the meaning of d and d’ in the standard diagram represented 
by Fig. 4.1. 

The various definitions and formulae of Section 119 are now to 


be thought of as duplicated. In particular, in (119.4) v, goes with 
T,, and vw, with 7, i.e. 


oT, oT, 
8.=V (voz) B= V (32) : (123.4) 
whilst in place of (119.2), 
8. = AA, = —A(Nv,H), 8 = AA, = —A(Wu, Hz), (123-5) 


where, in accordance with previous definitions, H and H, are 
respectively the coordinates of the points of intersection of # with 
the normal planes through O,,; and E;. 
The following notation is well adapted to the problem in hand.
We write

        $G_{a;j} = \sum_{i=1}^{j-1} g_{ai}$,    $G_{a:j} = \sum_{i=j}^{k} g_{ai}$,        (123.6)


with analogous definitions when the subscript a is replaced by 5. 

When the symbols on the left are primed, the summations go from 

1 to $j$ and from $j+1$ to $k$ respectively, and the sum is to be regarded
as zero when the upper limit of the summation is less than the lower. 

(If one wishes to omit the subscripts $j$, the semicolon and colon

must of course be retained.) We also set 


        $G_{a;j} + G_{a:j} = \sum_{i=1}^{k} g_{ai} = G_a$.        (123.7)
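
The bookkeeping behind the 'semicolon' and 'colon' sums can be sketched as follows. This is only an illustration: it reads the unprimed sums as running from 1 to $j-1$ and from $j$ to $k$ respectively, and it uses plain numbers where the $g_{ai}$ of the text are two-component quantities. The function name is hypothetical.

```python
def partial_sums(g):
    """Return (G_semi, G_colon, G_total) for g = [g_1, ..., g_k].

    G_semi[j-1]  = g_1 + ... + g_{j-1}   (empty sum, i.e. zero, for j = 1)
    G_colon[j-1] = g_j + ... + g_k
    Scalars stand in for the two-component quantities of the text.
    """
    total = sum(g)
    G_semi, G_colon = [], []
    running = 0.0
    for value in g:
        G_semi.append(running)           # sum over surfaces before j
        G_colon.append(total - running)  # sum over surface j and those after it
        running += value
    return G_semi, G_colon, total


G_semi, G_colon, G_total = partial_sums([1.0, 2.0, 3.0, 4.0])
```

For every $j$ the two partial sums add up to the same total, which is the content of the identity relating them to $G_a$.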
In view of (119.2) one can now write down the somewhat trivial 
set of four equations 
N1%1H, — Nj0.;H; = Ga; p 
N1%1Hg1 — Nj%;Ha; = G,; 5 
Nj %ajHj —Ni,VaxHy, = Gays 
N;%;Hx; — N70. = Go. ;- (123.8) 
From the first and second pairs of these one finds without difficulty 
that 5G,.;-+mGp.3 = —Nj(0jSH, + 0H p,)—fee Vir 
G,.;+ Gy.3 = Nj(0a;H,; + %;Ha;) +f Vi- (123.9) 


Now multiply the second of these alternatively by s and m and then 
add it to the first. Setting fz = 1 as often before, one then gets 
N;%,(s —m) Hy; =§ + (sG, + mG, ae sG, 5), 
(123.10) 
Nj @a;(s —m) Hi; = —M = (sG,;; +mG,.; +mG,). 


Upon multiplying the first throughout by v,; and the second by z,; 
and mutually subtracting the resulting equations, there comes 
finally
        $V_j = v_{aj}(S + \delta_{sj}) + v_{bj}(M + \delta_{mj})$,        (123.11)
where

        $\delta_{sj} = mG_{b;j} + sG_{b:j} + sG_a$,    $\delta_{mj} = sG_{a;j} + mG_{a:j} + mG_b$.        (123.12)


Clearly $V'_j$ is given by an expression just like (123.11), the only
formal change being the appearance of primes attached to the paraxial
coefficients and to the functions $\delta_{sj}$, $\delta_{mj}$. The latter are in turn
given by expressions like (123.12), but with primes attached to all
functions which have an index $j$, new ranges of summation being
so implied, in accordance with the remark following equations
(123.6).

For quantities linearly related to $V_j$ and $H_j$ one can write down
convenient equations exactly analogous to (123.11), the only change
being in the paraxial coefficients. For example, if $Y_j$ denotes the
coordinates of the point of intersection of the ray with the polar tangent
plane of the $j$th surface, we have


        $Y = H - q_a V$,        (123.13)
where $q_a = -y_a/v_a$. The second member of (123.10), together with
(123.11), then gives

        $Y_j = y_{aj}(S + \delta_{sj}) + y_{bj}(M + \delta_{mj})$,        (123.14)


(118.18-19) having been used to deal with the various paraxial 
quantities. Paraxially there is of course no distinction between Y and 
y(= yp). 

It may be noted that in the course of computation one may obtain
the so-called increments $\delta_{sj}$ and $\delta_{mj}$ by referring them back to $\delta_{s1}$
and $\delta_{m1}$. Thus, if one simply writes


        $G = G_a + G_b$,        (123.15)

one has from (123.12), by inspection,
        $\delta_{s1} = sG$,    $\delta_{m1} = mG$.        (123.16)
Since   $\Delta\delta_s = -(s-m)g_b$,    $\Delta\delta_m = (s-m)g_a$,        (123.17)


$\delta_{s2}, \delta_{s3}, \ldots, \delta_{m2}, \ldots$ are then obtained successively, starting from the
first surface.
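
The successive construction of the increments can be sketched numerically. The sketch assumes the following reading of the relations of this section: the starting values are $sG$ and $mG$ with $G = G_a + G_b$, each surface changes the first increment by $-(s-m)g_b$ and the second by $(s-m)g_a$, and the closed forms are $mG_{b;j} + sG_{b:j} + sG_a$ and $sG_{a;j} + mG_{a:j} + mG_b$. Scalars stand in for two-component quantities, and all names are hypothetical.

```python
def increments(g_a, g_b, s, m):
    """Build the increments surface by surface, starting from surface 1.

    Returns two lists with entries for j = 1, ..., k+1.
    Assumed recursion: d_s -> d_s - (s-m)*g_b, d_m -> d_m + (s-m)*g_a.
    """
    G = sum(g_a) + sum(g_b)
    d_s, d_m = [s * G], [m * G]
    for ga, gb in zip(g_a, g_b):
        d_s.append(d_s[-1] - (s - m) * gb)
        d_m.append(d_m[-1] + (s - m) * ga)
    return d_s, d_m


def closed_form(g_a, g_b, s, m, j):
    """Assumed closed-form increments at surface j (1-based)."""
    Ga, Gb = sum(g_a), sum(g_b)
    a_semi, b_semi = sum(g_a[:j - 1]), sum(g_b[:j - 1])
    a_colon, b_colon = Ga - a_semi, Gb - b_semi
    d_s = m * b_semi + s * b_colon + s * Ga
    d_m = s * a_semi + m * a_colon + m * Gb
    return d_s, d_m


g_a, g_b = [0.3, -0.1, 0.25], [0.05, 0.2, -0.15]
d_s, d_m = increments(g_a, g_b, s=2.0, m=0.5)
```

Under these assumptions the recursion and the closed forms agree at every surface, and the two increments coincide after the last surface.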

So far we have contemplated in this section the variables $V_j$ and
$V'_j$ and those which are naturally associated with them, such as
$Y_j$, $H_j$, etc. We could equally well have focused our attention on
$\beta_j$ and $\beta'_j$; and then, of course, $Y^*_j$ $(= \alpha_j Y_j)$ must appear where $Y_j$
appeared before, whilst $H^*_j$ similarly replaces $H_j$, and so on. The
formal modifications of the various equations are of a trivial nature,
and one typical example will suffice to show what is involved. Thus,
in place of (123.11) we shall have


        $\beta_j = v_{aj}(\sigma + \delta_{\sigma j}) + v_{bj}(\mu + \delta_{\mu j})$,        (123.18)
where
        $\delta_{\sigma j} = mG^*_{b;j} + sG^*_{b:j} + sG^*_a$,
                                                                (123.19)
        $\delta_{\mu j} = sG^*_{a;j} + mG^*_{a:j} + mG^*_b$,

with
        $G^*_{a;j} = \sum_{i=1}^{j-1} g^*_{ai}$,        (123.20)


and so on.

It is rather striking that one can also write down equations for 
the intermediate variables in terms of the (reduced) ray-coordinates 
y’ (= N,Hz,,) and y, (= N,mH,), appropriate to the point charac- 
teristic, which closely correspond to (123.11), but are in fact for- 
mally somewhat simpler than these. Thus the first and fourth 
members of (123.8) read 

Nj %;H; = mar Yi = Gasp] 
| (123.21) 


Nj%;Hp; = ony + Gp.;. 
Now 


Nv, %)(Hg—H) = N(ya%—Yo%a) V = (s—m) "fr V, (123-22) 
because of (118.19). (123.21) then gives straight away the relation 


FV = Vagly’ + (S—m) Gy. 5] + ly + (8—™) Gazi] (123-23) 
This result is far from being of merely academic interest, for the 
following reason. We shall see later how T may be computed, and, 
if it is to be a characteristic function in the proper sense of the term, 
it must be exhibited as a function of the appropriate variables, 
say $\sigma$ and $\mu$. The present method, however, has inherently the kind
of flexibility to allow just as easily the computation of T—regarded 
merely as an optical distance—as a function of y’ and y,; and in the 
course of this (123.23) will play a crucial part. On the other hand, 
we have the general relation 


V = T-d'a'+B'.y’—B.y,/m (123.24) 


(where the second term on the right represents the movement of the
posterior base-point from $O'_0$ to $E'$) so that the knowledge of $T(y', y_1)$
implies that of $V(y', y_1)$. The difficulties outlined in Section 117,
at any rate, evidently do not arise here, because the point characteristics
of the individual surfaces are never contemplated.


124. On the direct computation of the third-order 
characteristic coefficients 


We return to the general question of the direct computation of the
coefficients of $t$, meaning thereby any method which does not go
through the displacement $\epsilon'$. In particular, we shall consider the
special case of $t^{(2)}$ in detail, in view of the apparent difficulty which
we encountered at the end of Section 121. To deal with the general
principles involved, let

        $T = \sum_{j=1}^{k} T_j(\xi_j, \eta_j, \zeta_j)$        (124.1)

be given, the base-points $O_{0j}$, $O'_{0j}$ being understood. Now, for any
ray through $K$, $H_{j+1} = H'_j$, i.e.


Ta 
+37 =0 or a 3 124.2 
Bj. OB; Y ae 


However, $\beta_{j+1} = \beta'_j$, so that, in view of (124.1), we have the $k-1$
relations

        $\partial T/\partial\beta_j = 0$    $(j = 2, \ldots, k)$.        (124.3)


These evidently express the fact that T is stationary with respect to 
small variations of the intermediate variables: which, the initial 
and final rays being fixed, is simply a direct reflection of Fermat’s 
Principle. Moreover, they constitute a set of 2k — 2 equations, which 
allow us, in principle, to express $\beta_2, \ldots, \beta_k$ as functions of $\beta_1$ and
$\beta'_k$; or what comes to the same thing, of $\sigma$ and $\mu$; and the exact
solution of these equations must of necessity just have the form
(121.6). We thus have


        $\beta_j = v_{aj}\sigma + v_{bj}\mu + B_j$,        (124.4)
where the $B_j$ are certain expressions which contain no linear
terms:

        $B_j = O(3)$.        (124.5)


$T$, as given by (124.1), is as yet a function of $\sigma$, $\mu$ and $\beta_2, \beta_3, \ldots, \beta_k$;
and it is convenient to write it simply as


        $T = J(\sigma, \mu; \beta_2, \ldots, \beta_k) = J(\sigma, \mu; \beta)$,        (124.6)
so that $\beta$ stands here for the whole set of intermediate variables.
(124.4) is now to be inserted into (124.6). There comes


        $T = T(\sigma, \mu) = J(\sigma, \mu; v_a\sigma + v_b\mu + B)$
          $= J(\sigma, \mu; v_a\sigma + v_b\mu) + \sum_j (\partial J/\partial\beta_j) \cdot B_j + \delta J$,        (124.7)


where $\delta J$ involves second and higher derivatives of $J$. (All derivatives
are evaluated at $B = 0$.) Now, according to (124.3) the
first derivatives on the right of (124.7) vanish. Further, all terms of
$\delta J$ are at least quadratic in the $B_j$, so that in view of (124.5), $\delta J = O(6)$.


It follows that   $T = J(\sigma, \mu; v_a\sigma + v_b\mu) + O(6)$.        (124.8)


We therefore conclude that, upon substituting in (124.1) for the
$\beta_j$ their paraxial expressions in terms of $\sigma$ and $\mu$, the resulting function
differs from the correct angle characteristic by terms which are
of degree not less than six. On the other hand, the substitution of
linear expressions in $T_j^{\dagger}$ leaves the latter linear in $\xi$, $\eta$, $\zeta$, so that
we are indeed justified in taking


        $t^{(2)} = \sum_{j=1}^{k} t_j^{(2)}(\xi_j, \eta_j, \zeta_j)$        (124.9)


and simply using (118.5) to eliminate the intermediate variables.
In short, the effect of the $B_j$ induced by $\sum T_j^{\dagger}$ in $t^{(2)}$ is nugatory.
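
The mechanism at work here, namely that a function which is stationary with respect to an intermediate variable is insensitive to first-order errors in that variable, can be seen in a one-variable toy model. The function below is invented purely for illustration and is not one of the characteristic functions of the text: it is stationary in $b$ at $b = x + x^3$, so that the 'paraxial' guess $b = x$ is in error by a term of degree three, while the resulting error in the function is of degree six.

```python
def J(x, b):
    """Toy function, stationary in b where b = x + x**3 (illustrative only)."""
    return 0.5 * b * b - b * (x + x ** 3)


def b_exact(x):
    # Solves dJ/db = 0 exactly.
    return x + x ** 3


def b_paraxial(x):
    # Linear part of the exact solution; the neglected part is O(3).
    return x


def defect(x):
    """Error committed by inserting the paraxial value instead of the exact one."""
    return J(x, b_paraxial(x)) - J(x, b_exact(x))


# Halving x should reduce the defect by about 2**6 = 64.
ratio = defect(0.1) / defect(0.05)
```

In this model the defect is exactly $x^6/2$, i.e. half the square of the neglected third-degree term, which mirrors the statement that an $O(3)$ correction to the intermediate variables affects the function only at $O(6)$.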

This conclusion, though elementary in character, is so striking 
that it will not come amiss to confirm it by detailed calculation, in 
the light of the equations of Section 123. This may be done as 
follows. From (117.17) and (119.8) 


TP = MfE-29+ H-N'(—NE+M—NEl, (124-10) 


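the index $j$ being suppressed on the right. By means of (119.6-7)
we easily convince ourselves that the factors multiplying $\tfrac12\xi$ and $\tfrac12\zeta$
are $f v'_a/v_a$ and $f v_a/v'_a$ respectively, so that (124.10) becomes


        $T^{\dagger} = \tfrac12 f v_a v'_a \{[\Delta(\beta/v_a)]^2 + [\Delta(\gamma/v_a)]^2\}$.        (124.11)
Now, from (123.18),
        $\Delta(\beta/v_a) = \mu\,\Delta\theta + \Delta(\delta_\sigma + \theta\delta_\mu) = \mu\,\Delta\theta + D$,        (124.12)


say, where we have set   $\theta = v_b/v_a$.        (124.13)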
Accordingly (124.11) becomes
        $T_j^{\dagger}(\xi_j, \eta_j, \zeta_j) = \tfrac12 f v_a v'_a [\zeta(\Delta\theta)^2 + 2(\Delta\theta)\,\mu \cdot D + D \cdot D]$,        (124.14)


in a notation suggested just after equation (14.4). Summing over
$j$ from 1 to $k$, we evaluate the various terms in turn. To begin with,

foq0ghO = fb(1 — Rj ig = (s—m), (124.15) 
by (118.20-21) and (119.8). Hence 
        $\sum f v_a v'_a (\Delta\theta)^2 = (s-m)^{-1}(\theta'_k - \theta_1) = 1/m$,        (124.16)


since $\theta'_k = -1$ and $\theta_1 = -s/m$, according to (118.6). As regards
the second term on the right of (124.14) we have, with (124.15),


        $(s-m)^{-1}\,\mu \cdot \sum D = (s-m)^{-1}\,\mu \cdot (D_{k+1} - D_1)$.


In view of (123.16) $D_1$ vanishes. Again, $\delta_{\sigma,k+1} = \delta_{\mu,k+1} = sG_a + mG_b$,
and so $D_{k+1}$ also vanishes. Thus we have finally


        $\sum T_j^{\dagger}(\xi_j, \eta_j, \zeta_j) = \zeta/2m + \tfrac12 \sum f v_a v'_a\, D \cdot D$.        (124.17)


Here (121.5) may be recalled: we see that there are indeed no terms
of the fourth degree on the right of (124.17) at all. To this extent
we have therefore confirmed explicitly the result we got earlier in
this section. Here, however, we have more, in as far as the second
term on the right of (124.17) is an explicit expression for the contribution
by $\sum T_j^{\dagger}$ to the terms of the sixth and higher degrees of
$T$; see Section 130.

Before going on to consider the question of higher-order coefficients
it will be wise to return briefly to the results of Section
121, with particular reference to the coefficient $p_6$ which was not
included in (121.3).


125. The coefficient $p_6$. Incidental remarks concerning 
duality 


To begin with, consider the determination of $p_6$ by the direct
method described in the preceding section. From (117.17) one
has at once


T = T+ re(AN) (x — 3Kx") — 3N'(lp—1) (6 + 38?) 
+4N(, —1r) (€ +462) + O(6), (125.1) 


the index $j$ having been suppressed throughout. Since
x = 4E—29 +0) + aE 6)? + O(6), (125.2) 
we then get from (125.1) 


t®) = ENr{(k—1)-*[K(E—29 +6) -(E-C)"] 
+%q[(87/vq) —(G7/va)}}- (425.3) 
In principle it only remains to make the usual paraxial substitutions
for $\beta$ and $\beta'$. In this way one does indeed recover (121.3) exactly.
On the other hand it appears that the labour involved in this procedure
exceeds that required to go through the steps which lead
from (120.5) to (120.15). This is largely because, whereas the expression
(125.3) is quartic in the intermediate variables, (120.5) is
only cubic; whilst this cubic has, at the same time, a linear factor.
These simplifying features, taken together, probably make the
method of Section 120 much the most convenient as far as the third-order
aberrations are concerned. It is deficient only in as far as it
does not provide us with the value of the coefficient $p_6$ which multiplies
$\zeta^2$ in $t^{(2)}$, a coefficient required both in the context of object
shifts and of the exact higher-order displacement.

To remedy this situation it suffices to make the appropriate
paraxial substitutions, taking $\xi = \eta = 0$, i.e. one sets


Evy, F>ymb, Cos (125.4) 
using (118.21) at the same time. Then 
By = ENe{(k~1)*[A(R— 1) 044 — (Av? + 2g 8A0))] 
+iA[(60+ ¢)/0}}, (125.5) 


the subscript $a$ on the paraxial coefficients being understood, as in
(120.14). The factors multiplying the various powers of $\theta$ may be
simplified in the usual way. Upon finally summing over $j$ one gets
the desired result,


Pe = FLO) + (s—m)? OO — 3 g?/00)). (125.6) 


It is instructive to consider (125.6) from another point of view.
We have already introduced $T_b$ alongside $T_a$ as a device designed
towards being able to write down formally explicit equations such
as (123.11), representing the solution of (124.3). The relation
between $T_a$ and $T_b$, or equivalently between $T_{aj}$ and $T_{bj}$, is very simple,
and given the one, the other need not be calculated separately.
At any rate, supplying the coefficients of the power series for $T_b$ and
$T_a$ with additional subscripts $b$ and $a$ respectively, a moment’s
reflection shows that $p_{6b}$ must be the same function of the paraxial
$b$-coefficients as $p_{1a}$ is of the $a$-coefficients. Thus, recalling (120.14)
and the first member of (121.3), 


Pog = N(R —1) 0 Yo (t +%)- (125.7) 


Writing z, = 62,, etc., as usual, and leaving the index a understood, 
(125.7) becomes 


Poe = 20% — 36°(s—m)* (Av®) —49%(s—m)-20, (125.8) 
On the other hand, selecting the quadratic terms of (123.1), we 


have PAP = Ao— m2 (Boh) oe). (125.9) 
The substitutions (125.4) in this at once give, with (118.19), 
Pas— Pos = 3(s —m)~* A(v}/eq) 
= £(s—m) 1 [PA(v?) + 3029GAv + GA(1/v)]. (125.10) 


Upon combining this with (125.8) and summing over $j$, one exactly
recovers (125.6).

One easily convinces oneself that if $t_b^{(2)}$ be obtained by going
through the displacement after the fashion of Section 120 (one is
then considering the displacement associated with the pupil planes)
it is $p_{1b}$, and not $p_{6b}$, which remains undetermined. In this
sense, therefore, $p_{6a}$ can be obtained by dealing with displacements
alone. This conclusion involves no contradiction with what was
said earlier, since we have now been contemplating displacements
relating to two distinct pairs of conjugate planes, and the respective
coefficients are related to each other by equations given essentially
by (45.9): and they involve $p_6$, that is to say, $p_{6a}$, in their right-hand
members.

In practice one may well decide to calculate the coefficients of
both $T_a$ and $T_b$ to whatever order may be required. In the first
place one then has an excellent check upon one’s calculations, in as
far as one’s results must conform with (123.3). Further, thinking in
terms of digital computers, the set of instructions which have to be
fed into the machine need virtually be no greater than if $T_a$ alone
were being considered, provided one does not make use of relations
such as (118.21); that is to say, the use of $\theta$ is to be abandoned. In
that case the same set of instructions will yield $p_{\alpha a}$ ($\alpha$ = 1, 2, 3, 4, 5, 6)
on the one hand and $p_{\alpha b}$ ($\alpha$ = 6, 5, 3, 4, 2, 1) on the other; only the
numerical input data differing in the two cases. It is true that there
will be redundancy in the output, but (numerical checks quite
apart) it is very advantageous to have the various $a$- and $b$-quantities
available independently of each other at the intermediate stages
of higher-order calculations. In conclusion it should be mentioned 
that the possibility of using one set of instructions to compute two 
distinct sets of coefficients is the result of a formally ‘symmetric’ 
construction of the whole theory with respect to a-quantities and 
b-quantities: at the most rudimentary level even the initial a- and 
b-rays are calculated according to the same programme. At any rate, 
the situation just described is exactly that dealt with at length 
under the heading of duality in Section 6 of OAC XII, the details 
of which may be translated into the present context without undue 
difficulty. 


126. The idea of iteration 


The time has come to contemplate the computation of coefficients 
of higher order. Here we can do little more than to adumbrate 
the general principles involved, though enough detail will be pro- 
vided so that, not forgetting the availability of the very explicit 
details of the Lagrangian method, it should not be too difficult to 
translate theory into practice. 

For the purpose of discussing the modus operandi of the pro- 
cess of iteration about to be described, we concern ourselves for the 
present with the displacement, regarded as a function of S and M. 
Suppose, then, that at each surface the quantities $g_{aj}$ and $g_{bj}$, as
defined by (119.9), have been written down as power series in the
‘local’ variables $V_j$ and $V'_j$:


        $g_{aj} = g_{aj}^{(2)} + g_{aj}^{(3)} + \ldots$,    $g_{bj} = g_{bj}^{(2)} + g_{bj}^{(3)} + \ldots$,        (126.1)


where $g_{aj}^{(n)}$ and $g_{bj}^{(n)}$ are each homogeneous of degree $2n-1$. With
the usual paraxial substitutions one then gets at once $g_{aj}^{(2)}$ and
$g_{bj}^{(2)}$ as functions of $S$ and $M$, since the non-paraxial terms of (123.11)
are irrelevant to the dominant terms of the $g_j$. Correctly to the third
order, the increments $\delta_{sj}$ and $\delta_{mj}$, as given by (123.12), follow
immediately by mere summation. Indeed, as suggested just after
equation (123.15) we obtain them most easily from


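        $\delta_{sj}^{(2)} = s \sum_{i=1}^{k} (g_{ai}^{(2)} + g_{bi}^{(2)}) - (s-m) \sum_{i=1}^{j-1} g_{bi}^{(2)}$,
                                                                (126.2)
        $\delta_{mj}^{(2)} = m \sum_{i=1}^{k} (g_{ai}^{(2)} + g_{bi}^{(2)}) + (s-m) \sum_{i=1}^{j-1} g_{ai}^{(2)}$,


the first sum on the right, which is independent of $j$, being the same
in each case. In view of (123.11) we now have at this stage expressions
for $V_j$ and $V'_j$ in terms of $S$ and $M$ which are correct to the
third order, e.g.


        $V_j = v_{aj}(S + \delta_{sj}^{(2)}) + v_{bj}(M + \delta_{mj}^{(2)}) + O(5)$.        (126.3)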


Now return to (126.1), and let $g_j$ stand alternatively for $g_{aj}$ and $g_{bj}$.
Substitute the expression (126.3) for $V_j$ and the corresponding
expression for $V'_j$ in $g_j$, rejecting all terms of degree greater than five.
This means that every $g_j^{(n)}$ which has $n > 3$ is to be ignored, whilst
in $g_j^{(3)}$ one may content oneself with the usual paraxial substitutions.
As regards $g_j^{(2)}$ one recovers the previous third-order terms,
but in addition they will contain fifth-order terms which arise from
the known third-order expressions for the increments. Altogether we
now have $g_j$ correctly to the fifth order. In the same way as before,
simple summation immediately yields the increments correctly
to the fifth order, so that $V_j$ and $V'_j$ are now known as functions of
$S$ and $M$ with an error which is $O(7)$. These are now used in (126.1);
and upon selecting all the terms of degree seven one has, when these
are taken together with the lower-order terms already determined,
$g_j$ correctly to the seventh order. It should be obvious by now how
one can proceed systematically, step by step, to whatever order
desired; and this process is called iteration. Note that if $\epsilon'$ is required
to order 27-1, g{” need not be calculated, since according to 


(120.3) kom 
=(s—m) > X g)+ O(an+1), (126.4) 
j=le=1 


and in this only the g{§) with s < m are required in the course of 
iteration. 
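
The way in which each pass of iteration secures one further order can be illustrated with a one-variable toy problem that is not taken from the text: solving $v = s + c\,v^3$ for $v$ as a power series in $s$. Starting from the 'paraxial' value $v = s$, each substitution of the current series into the right-hand side makes two more degrees correct. Polynomials are represented as coefficient lists, and all names are hypothetical.

```python
def mul(p, q, max_deg):
    """Product of two coefficient lists, truncated beyond degree max_deg."""
    out = [0.0] * (max_deg + 1)
    for i, a in enumerate(p):
        if a == 0:
            continue
        for j, b in enumerate(q):
            if i + j > max_deg:
                break
            out[i + j] += a * b
    return out


def iterate(c, passes, max_deg):
    """Solve v = s + c*v**3 as a truncated power series in s by substitution."""
    s = [0.0, 1.0] + [0.0] * (max_deg - 1)
    v = s[:]  # 'paraxial' starting point v = s
    for _ in range(passes):
        v_cubed = mul(mul(v, v, max_deg), v, max_deg)
        v = [si + c * ti for si, ti in zip(s, v_cubed)]
    return v


after_two = iterate(1.0, 2, 7)    # correct through degree 5 only
after_three = iterate(1.0, 3, 7)  # correct through degree 7
```

With $c = 1$ the exact series begins $s + s^3 + 3s^5 + 12s^7$. After two passes the coefficient of $s^5$ is already correct while that of $s^7$ is not; a third pass corrects it, in the same step-by-step fashion as the process described above.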
Suppose, then, that $\epsilon'$ has been obtained, to the required order,
as a function of $S$ and $M$. As far as the displacement is concerned
the presence of these variables, rather than of $\sigma$ and $\mu$, does no harm,
for in the paraxial approximation $S$ and $M$ have the same significance
as $\sigma$ and $\mu$, and we may simply consider the pseudo-displacement
now to be defined with respect to $S$ and $M$. In any event, one
has for the variables $y'$ $(= H'_k)$ and $y_1$ $(= mH_1)$, which occur in the
series (23.1) defining the effective coefficients, the expressions


ye =S4+8,, yr =M+65,,1, (126.5) 


and here everything is known to the required order. The effective 
coefficients are therefore obtainable in a manner entirely analogous 
to that set out in Section 24. 

If the angle characteristic is to be found from $\epsilon'$ by integration
one must first eliminate $S$ and $M$ in favour of $\sigma$ and $\mu$. Trivial
though this task may be in principle, it is somewhat cumbersome
and unattractive in practice, and we would like to avoid it. In other
words, we should proceed in terms of $\sigma$ and $\mu$, and therefore in terms
of $g^*_j$ rather than $g_j$ from the outset, as we shall indeed do in
Sections 127-9. The general process of iteration is, of course,
exactly the same as before, in view of the generic form of equations
(123.18) and (123.20). On the other hand, one now unfortunately
ends up with $\epsilon^{*\prime}$ $(= H^{*\prime} - mH^*)$, and this is not the displacement.
Rather than discuss this point now, it is preferable to come back to
it in Section 130, where we resume the investigation of the whole
problem of computing the coefficients of $T$.


127. Modified increments. The variables $\sigma_j$ and $\mu_j$


As soon as one attempts actually to iterate in the manner described
in the preceding section one becomes aware of a most irksome
complication. It is a consequence of the fact that the increments
appropriate to $\beta_j$ differ from those appropriate to $\beta'_j$. Were the
increments the same, as they are in O(9.3) or O(12.9), any series in
powers of $\beta_j$, $\beta'_j$ could be rewritten exactly as a series in powers of $\sigma$
and $\mu$ by first using the paraxial relations and then replacing $\sigma$
by $\sigma + \delta_{\sigma j}$ and $\mu$ by $\mu + \delta_{\mu j}$, such a two-step procedure being very
convenient in practice; cf. Sections 11, 81 and 84 of OAC. Here,
however, one could at best use, in the first step just described,
relations of the kind


        $\beta_j = v_{aj}\grave{\sigma} + v_{bj}\grave{\mu}$,    $\beta'_j = v'_{aj}\acute{\sigma} + v'_{bj}\acute{\mu}$,        (127.1)


the additional accents being intended to keep track of the fact that
in the resulting ‘pseudo-expansion’ one must eventually replace
$\grave{\sigma}$ by $\sigma + \delta_{\sigma j}$, but $\acute{\sigma}$ by $\sigma + \delta'_{\sigma j}$, and so on. Thus, already in the context
of the third-order terms of $g^*_j$, say, one is temporarily concerned
with forty instead of the usual six coefficients, and likewise in fifth
order with 220 instead of the usual twelve. This is obviously a very
unhappy situation and we must seek to avoid it. This may, indeed,
be done as follows.
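
Before proceeding, the counts just quoted can be checked by a simple enumeration. The sketch assumes that each term of the expansion is a product of pairwise scalar products of the available two-component variables, multiplied by one of those variables, and that every such formal term is given its own coefficient; with two variables this reproduces six and twelve, with four distinct (accented) variables forty and 220. The function name is hypothetical.

```python
from itertools import combinations_with_replacement


def count_coefficients(n_vectors, degree):
    """Count formal terms (product of scalar products) * (one vector).

    n_vectors: number of distinct two-component variables
    degree:    total (odd) degree of the terms in those variables
    Assumes each formal term carries an independent coefficient.
    """
    n_invariants = n_vectors * (n_vectors + 1) // 2  # pairwise scalar products
    n_factors = (degree - 1) // 2                    # scalar products per term
    monomials = sum(
        1 for _ in combinations_with_replacement(range(n_invariants), n_factors)
    )
    return monomials * n_vectors


usual = (count_coefficients(2, 3), count_coefficients(2, 5))
accented = (count_coefficients(4, 3), count_coefficients(4, 5))
```

The rapid growth from `usual` to `accented` is what makes a single pair of modified increments, common to both equations, so attractive.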
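We demand that $\beta_j$ and $\beta'_j$ be given by expressions of the form


        $\beta_j = v_{aj}(\sigma + d_{\sigma j}) + v_{bj}(\mu + d_{\mu j})$,
                                                                (127.2)
        $\beta'_j = v'_{aj}(\sigma + d_{\sigma j}) + v'_{bj}(\mu + d_{\mu j})$,


the same modified increments $d_{\sigma j}$, $d_{\mu j}$ appearing in both equations.
Then we must have

        $v_a d_\sigma + v_b d_\mu = v_a\delta_\sigma + v_b\delta_\mu$,    $v'_a d_\sigma + v'_b d_\mu = v'_a\delta'_\sigma + v'_b\delta'_\mu$.        (127.3)


These equations are easily solved for $d_\sigma$ and $d_\mu$, and it turns out that
the solution may be written compactly as


        $d_\sigma = \delta_\sigma + v_b\delta$,    $d_\mu = \delta_\mu - v_a\delta$,        (127.4)
where   $\delta = (v'_a v_b - v_a v'_b)^{-1}(v'_a\,\Delta\delta_\sigma + v'_b\,\Delta\delta_\mu)$
          $= (s-m)(v'_a v_b - v_a v'_b)^{-1}(v'_b g^*_a - v'_a g^*_b)$.        (127.5)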


It may be remarked that the same device may be employed in the 
context of equation (123.23). When the surface is spherical 


        $g^*_b = \theta g^*_a$,
as we know from (119.12); and then (127.5) reduces to 
§ = (s—m) 6 w, (127.6) 


in view of (119.14). The expression on the right makes no explicit 
reference to the a- or b-rays whatsoever. 
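
The determination of the modified increments amounts to solving a pair of linear equations, which may be sketched as follows. The sketch assumes the requirement to be that one pair $(d_\sigma, d_\mu)$ reproduces both $v_a\delta_\sigma + v_b\delta_\mu$ and $v'_a\delta'_\sigma + v'_b\delta'_\mu$; it uses scalars in place of two-component quantities, and solves by Cramer's rule rather than by the compact form quoted in the text. All names are hypothetical.

```python
def modified_increments(va, vb, va_p, vb_p, ds, dm, ds_p, dm_p):
    """Solve  va*d_sigma   + vb*d_mu   = va*ds     + vb*dm
              va_p*d_sigma + vb_p*d_mu = va_p*ds_p + vb_p*dm_p
    for (d_sigma, d_mu) by Cramer's rule.

    A trailing _p marks a primed quantity (assumed naming convention).
    """
    det = va * vb_p - va_p * vb
    if det == 0:
        raise ValueError("paraxial coefficients give a singular system")
    r1 = va * ds + vb * dm
    r2 = va_p * ds_p + vb_p * dm_p
    d_sigma = (r1 * vb_p - vb * r2) / det
    d_mu = (va * r2 - va_p * r1) / det
    return d_sigma, d_mu


d_sigma, d_mu = modified_increments(1.0, 2.0, 3.0, 5.0, 0.1, 0.2, 0.3, 0.4)
```

Because the first equation forces $v_a(d_\sigma - \delta_\sigma) + v_b(d_\mu - \delta_\mu) = 0$, the corrections to the two increments are necessarily proportional to $v_b$ and $-v_a$ with a common factor, which is why a single auxiliary quantity suffices in the compact form.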
A moment’s reflection will show that the device just described
is exactly equivalent to introducing in place of $\beta_j$ and $\beta'_j$ local variables
$\sigma_j$ and $\mu_j$, defined as


        $\sigma_j = -(v_{aj}v'_{bj} - v'_{aj}v_{bj})^{-1}(v_{bj}\beta'_j - v'_{bj}\beta_j)$,
                                                                (127.7)
        $\mu_j = (v_{aj}v'_{bj} - v'_{aj}v_{bj})^{-1}(v_{aj}\beta'_j - v'_{aj}\beta_j)$,

a step already suggested by the form of (124.11). Note that $\sigma$ and
$\mu$ retain their previous significance. Using (127.2), there comes


        $\sigma_j = \sigma + d_{\sigma j}$,    $\mu_j = \mu + d_{\mu j}$.        (127.8)


The formally extremely simple structure of these equations implies
a corresponding degree of formal simplification with regard to iteration.
This may be seen as follows. Given, as before, some series in
ascending powers of $\beta_j$ and $\beta'_j$, we put


        $\beta_j = v_{aj}\sigma_j + v_{bj}\mu_j$,    $\beta'_j = v'_{aj}\sigma_j + v'_{bj}\mu_j$,        (127.9)
according to (127.7). The corresponding pseudo-expansion arises
by merely omitting from every $\sigma_j$ and $\mu_j$ the index $j$. The correct
series is restored by subsequently replacing $\sigma$ by $\sigma + d_{\sigma j}$ and $\mu$
by $\mu + d_{\mu j}$. Of course, it must not be thought that the labour involved
in obtaining the pseudo-expansion has been reduced in this
way, for (127.9) represents, in effect, just the usual paraxial substitutions
which lead to it.

Two small points are worthy of remark at this stage. The first
concerns equation (119.4). Because of (127.7) we can now write


oT, 01; 
S46 =a, By =x, (127.10) 
a 0G; of Op.; 


but these seem to have no virtue other than formal elegance. Moreover,
they revolve about $g_j$, whereas there are no corresponding
equations for the more important quantities $g^*_j$. The second point
relates to the meaning of the expression $y_a(\sigma + d_\sigma) + y_b(\mu + d_\mu)$.
It cannot stand for either $Y^*$ or $Y^{*\prime}$ on account of its symmetry with
respect to primed and unprimed variables, and we suspect that it
is closely related to the coordinates of the point of incidence, $y$.
Indeed, using (127.7-8), one finds after a certain amount of manipulation
involving (117.4) and (118.18-19) that for a spherical surface


y(o+d,)+y(utd,) = (o/AN)y, (127.11) 


which confirms our expectations. 


128. Third- and fifth-order pseudo-coefficients 


It may not be entirely out of place to set out in somewhat greater 
detail those steps of the iterative process which are required to 
obtain fifth-order coefficients. This may serve as a guide towards 
what has to be done in practice when proceeding to higher orders, 
for which reason the notation will on the whole be already adapted 
to the general case. Again, it will suffice to consider only spherical 
surfaces explicitly, to avoid a number of complications which may 
well be regarded as irrelevant to the immediate purpose in hand. 
Nevertheless, here also the details of the notation will be so arranged 
as to allow readily for the presence of aspherical surfaces. We 
operate with the variables $\beta$ and $\beta'$, bearing in mind that the formal
changes required to go over to $V$ and $V'$ instead are of an elementary
nature; and they revolve mainly about the detailed form of (128.14)
and (128.17). This situation is to a large extent the result of the
flexibility of the whole method.

At every surface we must first expand $w$, as defined by (119.14),
in ascending powers of $\beta$ and $\beta'$, rejecting all terms of degree
greater than five. We therefore require only $w^{(2)}$ and $w^{(3)}$:


we = 3[«(E—279+6) +E] 8’ —d[x(E—29+6)+E)B, (128.1) 
w = 30? +«(E—C)?— ax(E — 29 + £) — 32 E— 279+ £)"] 8" 
— BLE? + «(G —£)? — ax(E — 29) +) E—30°(E —27) + €)7] B. 

(128.2) 

The remark following equation (124.4) should here be recalled.
Next we use (127.9) in (128.1-2), and, upon omitting the subscript
$j$ from $\sigma_j$ and $\mu_j$, we obtain the corresponding terms of the
pseudo-expansion of $w$, as explained after equation (127.9). We
henceforth distinguish pseudo-expansions, and individual terms
thereof, by angular brackets. Thus, quite generally, restoring the
index $j$ now to avoid confusion


(wi) = TW) = TD (wepjo+ wpe) eremA CH. (128.3) 


The coefficients which appear on the right, and more generally 
those in pseudo-expansions of any other quantity, will be called 
pseudo-coefficients; and they will always be distinguished by the use 
of German type. In practice, to avoid having constantly to write
down the superscripts ($n$) and the double subscripts, it is useful
to use a special notation for the lower orders, say for $n$ = 2, 3 and 4
at least. Thus,

(wy) = (PigS + Pag +'PajS) 6+ (BryS + Pos7 +'Paj6)u, (128.4) 

Kwi)) = (81588 +... +'86;6?) 0+ (By? +... Bg SB, (128.5) 
and so on. 

The notation for $\langle g^*_j \rangle$, i.e. $\langle g^*_{aj} \rangle$ and $\langle g^*_{bj} \rangle$, is entirely similar.
Thus <g¥*) is represented by a series just like that in (128.3), 
except that 9%, g% replace m7), and to{7), respectively. Strictly 
speaking we should write 9*‘# instead of g{%¥;, and so on, since 
g%%; is naturally associated with (g;). However, it is best at this 
point not to encumber the notation too much: one simply has to 
remember that an asterisk is missing. Further, (7%; stands for 
g,; or g§,; according as (gz;> or (g};) is being contemplated. 
Then, for spherical surfaces, 

O95 = Caj ro %,, O52 85 >= oj rw!%,, (128.6) 
and similarly for the barred coefficients. These equations are not 
valid, of course, if they refer to (g,> rather than (g#); nor do they 
apply to aspherical surfaces, for then there are additional terms on 
the right.

For the lower orders one will again use a notation like that of 
equations (128.4, 5), i.e. 


(oF) = (PyS+...) O+(PyE+-..) a, (128.7) 
and so on. (Here also every coefficient should have an additional 
asterisk.) Our immediate task is now to obtain expressions for the 
pseudo-coefficients of orders three and five in a form resembling, if 
possible, the primary coefficients of Section 120. 

It is advisable to be as systematic as possible, to which end we 
introduce three auxiliary quantities defined as follows: 


Pp = E4+20jn+ GE, OQ; = 24,4 4;6), Ry = $56. (128.8) 
Then, for example, € = v2;P;+,;0;+ R;. (128.9) 


The index $j$ may now be safely omitted; and at the same time the
index $a$ may also be left understood. Thus


K(E—297+€) =a'P, (128.10) 
and then there comes from (128.1)
(w> = H{[(a’ +?) v' — (a + v2) v] P+ (Av) R} (0+ Op) 
—4$¢[(Av?) P+(Av) Q]p. (128.11) 


Upon multiplying throughout by e(= e,) one naturally finds that 
the factor multiplying Pe is just p as defined by (120.14). Also set 


= }(s—m)" (Av), ‘6 = (s—m)6, (128.12) 
and then 
(g*)) = (pP+2'6S)(o+ Op) —(aP+3'G$Q) pm, (128.13) 
whence, recalling (128.7, 8), the required coefficients are given by 
Yi =P, P2 = 20p, 3 = Pp+ FG, 
Pi =%pi—a, f,=Op,—20a—'d, fi; =Op,—Pa—O'G. 
(128.14) 
The coefficients of (g#® follow from these by multiplying each 
of them by @. It must not be forgotten that f; has been given the 
value unity throughout. (If fg + 1 one has to include factors f, and 
fix respectively on the right of the equations (128.12) for a and ‘6, 
since every ), must be dimensionally a length.) 
We go on in like manner to the fifth-order coefficients. From 
(128.2) we get in the first place 
8(w®) = ((v?P+vO+ RP +a'[(v’ +0) P+ QP 
—2u’P(v®P+vQ + R) — 3272"2P?} [v'(o + On) + dp] 
—{(v?P+v'QO+ RP +2'[(v' +v) P+ OP 
— 2it'P(v"P + v'O + R) — 3274"2P*} [vo(o+ Op) + dp]. 
(128.15) 
Multiplying throughout by }e, one obtains for (g*® an expression 
of the form 
(B*) = (8P? + 529 PO + 539 “PR + 4549 *O? + 35 6 4R?) 
x (6+ Op) +(3,PPt 282 P*PQ+3sh PR + t34h °Q? 
+33s$-OR)p. (128.16) 
To evaluate the 4, and 3, one essentially needs only to read off the 
factors multiplying P?, PQ, ... in (128.15). Still, this is a somewhat 
tedious business, and it will suffice to quote the results:
a1 = a(R? —5k+1)?+3(R—1)t0+ 30"]p, 
32 = (v' +) dp, 
33 = dba— 3 Ot’, 
ba = PP, 
35 = 30"0, 
51 = 332 —4(k—1)?27a, 
32 = Pp — POVe?, 


(128.17) 


33 aioe $670, 
34 = 233 
35 = — 485: 


Returning now to &, 7, ¢ by using (128.8) in (128.16), there comes 
finally 
$1 = 3p 


8, = 463, + 32, 

8, = 2073, + 032+ ds) 

8, = 4073) + 2032+ da» 

85 = 4093, + 367 52+ 20(33 + 34); 


By = 043 + P50 + F(a + 84) + 355 

7 (128.18) 
$,= 63, +31 

So = 08+ 4031+ 3a, 

$3 = 085+ 207%, + 039+ 33, 

8, = 084+ 40%, + 2030+ 34; 

8, = 08, +4093, + 307%. + 20(53 + 3a); 

$5 = 0844643, + F730 + F7(5s + 3a) + G35: 


We note in passing that the particular structure of equations such
as (128.14), in which the same quantities tend to occur over and over
again, is no 'accident'. Thus, contemplate the quantity
$N(Y\gamma - Z\beta)$, which was shown to be an optical invariant on
general grounds in Section 8. We write it here as
$N\alpha^{-1}(Y^*\gamma - Z^*\beta)$, and the fact that this has the
same value in the image space as in the object space evidently implies
identities between the coefficients which occur in the expansions of
the increments. On the other hand we may also use the identity

$$ \Delta[N\alpha^{-1}(Y^*\gamma - Z^*\beta)] = 0 \tag{128.19} $$

at any particular surface. If we use (123.18), etc., in this, and restrict


ourselves, for the sake of illustration, to the third order we get at 
once the relation 


(Sy AB ye — 7 A3 yy) — (Hy Abra Heb) 
= (Fy, — Fs My) [(a'/a)—1] +06). (128.20) 
Here AS, = —(s—m)e,w and A& yz = (S—m) eqw, so that if we write 
WwW = wo+mp (128.21) 


for the moment, there comes 
Ww) = Ow) — 1 bA( 6? 4+ y?). (128.22) 
Then, upon multiplying throughout by e,, one reads off the relations 


Py i Op, — hq pAvi, De = OP, — eq PpAryr%, Ds = Ops— he, pAv;, 
(128.23) 


and these will be seen at once to be in complete harmony with the 
last three members of (128.14). 


129. The fifth-order iteration equations 


Having considered the various third- and fifth-order
pseudo-coefficients in reasonable detail, it remains to be a little
more explicit with regard to the iterative equations which yield the
coefficients of the exact series. It will be recalled that the latter
result from the pseudo-expansions by replacing in these $\sigma$ by
$\sigma + d_{\sigma j}$ and $\mu$ by $\mu + d_{\mu j}$ throughout.
Consequently it is most convenient from several points of view to
denote the coefficients of the exact series by the same kernel-letters
as those of the corresponding pseudo-expansions, except that italic
type replaces german type. For example,


wi = Z werwipymeryPee, (129.2) 
and w?) = (‘py E+...) o+(PyEt-.-) Bs; (129.2) 


and so on. Finally it is desirable to introduce a further set of 
coefficients of an auxiliary character, namely those which occur in 
the expansions of d,, and d,,;. We naturally write d{%),, and dip, 
for these, together with their barred counterparts. For the lower 
orders one will of course write p,1;, .... P,3j, and so on. (It should be 
borne in mind that strictly speaking every coefficient relating to d, 
and d,, should have an additional asterisk; and the latter will then 
be absent when one is dealing with d, and d,,.) 

With our system of notation complete, we can now write down 
the identity 


>» 2 [ai%?,(o +d,,) + G2(u t+ d,)] *E"-% *y2-F #CP 
= 2 2 (esis o+ BY, p.) E22 FCP, (129.3) 


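where

$$ \begin{aligned}
{}^*\xi &= \xi + 2\sigma\cdot d_{\sigma j} + d_{\sigma j}\cdot d_{\sigma j}, \\
{}^*\eta &= \eta + (\sigma\cdot d_{\mu j} + \mu\cdot d_{\sigma j}) + d_{\sigma j}\cdot d_{\mu j}, \\
{}^*\zeta &= \zeta + 2\mu\cdot d_{\mu j} + d_{\mu j}\cdot d_{\mu j}.
\end{aligned} \tag{129.4} $$

(129.3) simply reflects the idea of the pseudo-expansion, and both
its members equally represent $g^*$, that is to say, $g^*_a$ or
$g^*_b$, if every coefficient is given an additional index a or b
respectively.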
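The substitution behind (129.4) can be checked numerically. The following is only a sketch under stated assumptions: $\sigma$ and $\mu$ are taken to be two-component vectors with $\xi = \sigma\cdot\sigma$, $\eta = \sigma\cdot\mu$, $\zeta = \mu\cdot\mu$, each increment is taken to be purely of the third order, and the function names and coefficient tuples are hypothetical. With these assumptions the part of ${}^*\xi$ beyond the fourth degree is exactly $d_\sigma\cdot d_\sigma$, so that it must scale as the sixth power of the ray-coordinates.

```python
def dot(u, v):
    """Scalar product of two 2-vectors."""
    return u[0] * v[0] + u[1] * v[1]


def increment(p, pbar, sigma, mu):
    """Third-order increment (ASSUMED form, coefficients hypothetical):
    d = (p1*xi + p2*eta + p3*zeta)*sigma + (pbar1*xi + pbar2*eta + pbar3*zeta)*mu.
    """
    xi, eta, zeta = dot(sigma, sigma), dot(sigma, mu), dot(mu, mu)
    f = p[0] * xi + p[1] * eta + p[2] * zeta
    fbar = pbar[0] * xi + pbar[1] * eta + pbar[2] * zeta
    return (f * sigma[0] + fbar * mu[0], f * sigma[1] + fbar * mu[1])


def modified_invariants(sigma, mu, d_sigma, d_mu):
    """The three starred invariants, written out term by term as in (129.4)."""
    xi, eta, zeta = dot(sigma, sigma), dot(sigma, mu), dot(mu, mu)
    star_xi = xi + 2.0 * dot(sigma, d_sigma) + dot(d_sigma, d_sigma)
    star_eta = eta + dot(sigma, d_mu) + dot(mu, d_sigma) + dot(d_sigma, d_mu)
    star_zeta = zeta + 2.0 * dot(mu, d_mu) + dot(d_mu, d_mu)
    return star_xi, star_eta, star_zeta


def star_xi_fourth_degree(p, pbar, sigma, mu):
    """*xi truncated after the terms of degree four, i.e. xi + 2*sigma.d_sigma."""
    xi, eta, zeta = dot(sigma, sigma), dot(sigma, mu), dot(mu, mu)
    return xi + 2.0 * (p[0] * xi**2 + (p[1] + pbar[0]) * xi * eta
                       + p[2] * xi * zeta + pbar[1] * eta**2
                       + pbar[2] * eta * zeta)


if __name__ == "__main__":
    sigma, mu = (0.3, -0.2), (0.1, 0.4)
    p_s, pb_s = (0.5, -1.0, 2.0), (1.5, 0.25, -0.75)   # hypothetical values
    p_m, pb_m = (-0.4, 0.6, 0.2), (0.9, -0.1, 0.3)     # hypothetical values
    d_s = increment(p_s, pb_s, sigma, mu)
    d_m = increment(p_m, pb_m, sigma, mu)
    exact = modified_invariants(sigma, mu, d_s, d_m)[0]
    print("remainder beyond degree four:",
          exact - star_xi_fourth_degree(p_s, pb_s, sigma, mu))
```

Halving both $\sigma$ and $\mu$ should reduce the printed remainder by a factor of 64, which is a convenient test that no term of degree four has been lost.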
The modified increments are now to be written as power series.
It is best to do so in two steps, first writing down the series for
${}^*\xi$, ${}^*\eta$, ${}^*\zeta$ to the required order. Thus one
has, for example,

$$ {}^*\xi = \xi + 2[p_{\sigma 1j}\xi^2 + (p_{\sigma 2j} + \bar p_{\sigma 1j})\xi\eta + p_{\sigma 3j}\xi\zeta + \bar p_{\sigma 2j}\eta^2 + \bar p_{\sigma 3j}\eta\zeta] + O(6), \tag{129.5} $$

with analogous equations for ${}^*\eta$ and ${}^*\zeta$. Selecting now
the terms of degree 3 in (129.3) one has, of course,

$$ p_\alpha = \mathfrak{p}_\alpha, \quad \bar p_\alpha = \bar{\mathfrak{p}}_\alpha \quad (\alpha = 1, 2, 3), \tag{129.6} $$

where the index $j$ has been left understood, since in equations of
this kind every coefficient has this index. Next, selecting the terms
of degree five, we find the following fifth-order iteration equations:
Sy = 314371 Port (Pi tie) Pa 
5, = 81 +P. Port Bil By + Por) + PePpr» 
S. = 8 +)1(2P51 + 3D o2) + Pi Duo t+ PalPyat 2Por +P,,2) 
+ (Pet 23) Pu 
5, = 82+ Pi Poot Pil2por + Py2t 2Po2) + Pe Por 
+ Pol Por + 2Py1 + Py2) + 2H3 Pav 
Sg = 93+ 3P1 Post 91 Pst Pol Poi tP,s) 
+)3(2P 1 + Por) + Ps Pur 
53 = $+ P1 Pog t+ Pi(2p,3+P,s) + Bol Bor +P,3) 
+P3Poit 3P3 Pp 
Sq = 84+ 2) Pos +Po(2Po2+ Pus) + (Po + 20s) Pas 
54 = 84+ (2P1+Pe) Poot Po(2P,2+ Poa) +203 Pass 
S5 = 85+ 2P1 Pos tPo( Poet 2Pos + Piz) + PePus 
+P9(2h,2+ Poet 2Pys) + D3 Pua 
55 = 85+ (21+ Pe) Bos + Pol Pos + Post 2P,s) 
+3 Poot Pa(3Py2+ 2Pus)> 
Sg = 86+ De Pog + Pa(2P,3 + Poa) + Bs Pa» 
55 = 36+ (fo+Ps) Pog + 3Ps Pys- 


(129.7) 


The equations are quite general, in the sense that no use has been 
made of any relations which may exist between the coefficients 
which occur on the right. They are valid whether the surface is 
spherical or not. When it is, they have to be used only once, i.e. for 
ga\” (every pseudo-coefficient having then the additional index a) 
since gj = 6g*®); otherwise they are used twice, once for the 
coefficients of g?® and once for those of g¥®). As regards the 
use of digital computers, one requires, however, only one set of 
instructions. One can presumably do even better than that, since 
six of the fifth-order equations are the duals of the remaining six, 
in the sense explained at the end of Section 125. Thus s, goes into 
5¢) 5 into sg, and so on, upon mutually interchanging the indices 
1 and 3, and also the indices $\sigma$ and $\mu$ of the primary coefficients, at 
the same time replacing barred by unbarred symbols and vice versa; 
whilst the duals of the fifth-order coefficients 3,, 3), ... are 3g, 3g, --- 
as already set out in Section 125. 

Comparison of (129.7) with O(11.3) shows that the one may 
be obtained from the other by straightforward transcription accord- 
ing to the scheme 


Sy >Sy, $y >38, (@=1,...,6); A>Py, BP, CPs; 


(129.8) 
‘Ay > Puy ‘By >Py» ‘Cp Pass Gea 
‘A =a —Pov ‘B, ome — Por ‘Cy as — Poss 


and analogously for the barred coefficients. Therefore, if one 
wishes to go to the seventh order, one does not need to work out 
the required iteration equations ab initio. Instead, one can directly 
use O(81.3), with the appropriate transcription of the symbols 
which occur in it. 

All the third- and fifth-order coefficients $p_\alpha$, $\bar p_\alpha$,
$s_\alpha$, $\bar s_\alpha$ can now be calculated from (129.6-7),
together with (128.14) and (128.17-18); and for the sake of
convenience we recall here the connection between the
$p_{\sigma 1}, \ldots, \bar p_{\mu 3}$ on the one hand and the
$p_\alpha$, $\bar p_\alpha$ ($\alpha = 1, 2, 3$) on the other. This is
quite generally provided by equations (123.19) and (127.5). For
spherical surfaces in particular,


ke j= 
d,; = 2 + 9,) gf —(s—m) DBE + OF" M4 $i 83) 
ie = (129.10) 
dij =m & (244) 88+ (5m) & BFGF 2005 $587, 


where g* stands for g*, throughout. Then, for example, in view of 
(129.6), 
k es 
Poy = S X (1 +6;)py,—(s —m) 2 iPr +°O7 yj PjP1yz, (129-11) 


and so on, in an obvious way, for the remaining coefficients. 

The form of (129.11) is characteristic of the comparative lack 
of neatness one gets in Hamiltonian calculations. In the correspond- 
ing relation of the Lagrangian method only the middle term (sup- 
plied with the appropriate constant factor) remains. 
130. On the computation of the higher-order characteristic coefficients. Fifth-order pseudo-coefficients and iteration equations


We resume the discussion concerning the problem of the coefficients 
of T where we left off at the end of Section 126. As we know, in 
principle two broad alternatives present themselves from the point 
of view of practice, namely, either T is computed by integration of 
the displacement, or else directly. With regard to the first of these, 
there are again several alternatives, and we first deal with these in 
turn. 

If, on account of (120.3), one proceeds exclusively in terms of
$g_j$, $\varepsilon'$ will appear as a function of the 'wrong' variables S and M,
and these must be eliminated in favour of $\sigma$ and $\mu$ before we can
integrate. As already explained at the end of Section 126, this is an 
unattractive proposition, and we do not contemplate it. If, on the 
other hand, we deal with g*¥ alone, we end up with e*’ and this is not 
the displacement. This defect can be remedied, for example, as 
follows. In the course of iteration one makes use of g# as well as of 
gi, so that we also have 


k 
ef’ = a}, Hi, —sa, Hy, = —(s—m) = Sti = —(s—m)GF (130.1) 
j= 


available to a certain order. We can write 
es = &'H’ —soH+(s—m)m-p, 


where the subscripts k and 1 have been suppressed, without risk 
of confusion. Using this relation, we readily convince ourselves 
that, identically, 
é = (5 -3) Gi + (5 -;) (mG¥ + wp). (130.2) 
Now, restricting ourselves again to the fifth order, suppose that 
G* is known to this order: we have already seen in detail how to 
calculate it. In so doing we shall have obtained G¥ correctly to the 
third order, and this is all that is required, since $(1/\alpha') - (1/\alpha) = O(2)$.
Still, even here the factors arising from $\alpha$ and $\alpha'$ are not as simple
as one might like them to be, since they have to be written as 


series in powers of £, 7, 6, and not of & and €; but at least we now 
have e,; and e; as functions of o and p, as desired. 

So far we have explored only procedures based on either g to 
the exclusion of g* or vice versa. A little reflection shows that 
perhaps we should compromise and use both at the same time. In 
the context of fifth-order aberrations in particular one then has a 
very simple state of affairs which may be described as follows. One 
first proceeds as set out in Sections 128 and 129, but only to the 
third order, the coefficients of d,; and d,; being included in this 
third-order computation. Now let g,(= g,;)—-not g/—be expanded 
in powers of 8; and 8,. The third-order coefficients are already 
contained in (120.5), but those of the fifth order must be found 
ab initio from (119.9). The usual paraxial substitutions then lead 
to (g» and (g®)), the first of which is just the right-hand member 
of (120.15). To get the correct expression for g?) we now have to 
replace o by o+d,, and p by w+d_, in (g?). To this end equa- 
tions (129.7) may be used as they stand, provided we understand 
the coefficients on the left, and all pseudo-coefficients on the right 
to refer to g, rather than g¥; whilst the coefficients p,,, ...,p,3 are 
already known. In view of (120.3) we thus obtain $\varepsilon'$, correct to the
fifth order, as a function of the desired variables $\sigma$ and $\mu$.

We note the following point in passing. $\mathbf{G}$ has the generic form

$$ \mathbf{G} = G\sigma + \bar G\mu, \tag{130.3} $$

say, where $G$ and $\bar G$ are functions of $\xi$, $\eta$, $\zeta$.
(24.4) therefore entails that the integrability condition

$$ 2\,\partial\bar G/\partial\xi = \partial G/\partial\eta \tag{130.4} $$

must be satisfied. Thus, if, consistently with (123.7), we temporarily
write $\sum_j p_{1j} = P_1$, and so on, there is one third-order
condition:

$$ 2\bar P_1 = P_2, \tag{130.5} $$

and three fifth-order conditions:

$$ 4\bar S_1 = S_2, \quad \bar S_2 = S_4, \quad 2\bar S_3 = S_5. \tag{130.6} $$

Having calculated the various coefficients independently of each
other, one's confidence in the numerical results will be greatly
enhanced if the identities (130.6) are in fact satisfied.
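In a program the conditions (130.5) and (130.6) lend themselves to a one-line consistency test. The sketch below is illustrative only. It assumes that the third-order coefficients are ordered as those of $\xi$, $\eta$, $\zeta$ and the fifth-order ones as those of $\xi^2$, $\xi\eta$, $\xi\zeta$, $\eta^2$, $\eta\zeta$, $\zeta^2$; the helper `coefficients_from_potential` and its normalization $G = 2\,\partial t/\partial\xi$, $\bar G = \partial t/\partial\eta$ are hypothetical, and serve merely to manufacture a set of coefficients that is integrable by construction.

```python
def coefficients_from_potential(t4, t6):
    """Build (P, Pbar, S, Sbar) from a function t of xi, eta, zeta.

    ASSUMPTION: G = 2 dt/dxi and Gbar = dt/deta, with
    t4 = coefficients of xi^2, xi*eta, xi*zeta, eta^2, eta*zeta, zeta^2 and
    t6 = coefficients of xi^3, xi^2*eta, xi^2*zeta, xi*eta^2, xi*eta*zeta,
         xi*zeta^2, eta^3, eta^2*zeta, eta*zeta^2, zeta^3.
    """
    c1, c2, c3, c4, c5, c6 = t4
    e1, e2, e3, e4, e5, e6, e7, e8, e9, e10 = t6
    P = (4 * c1, 2 * c2, 2 * c3)                       # third order, unbarred
    Pbar = (c2, 2 * c4, c5)                            # third order, barred
    S = (6 * e1, 4 * e2, 4 * e3, 2 * e4, 2 * e5, 2 * e6)
    Sbar = (e2, 2 * e4, e5, 3 * e7, 2 * e8, e9)
    return P, Pbar, S, Sbar


def integrability_residuals(P, Pbar, S, Sbar):
    """Left minus right members of (130.5) and (130.6); all should vanish."""
    return (
        2 * Pbar[0] - P[1],   # 2 Pbar_1 = P_2
        4 * Sbar[0] - S[1],   # 4 Sbar_1 = S_2
        Sbar[1] - S[3],       # Sbar_2 = S_4
        2 * Sbar[2] - S[4],   # 2 Sbar_3 = S_5
    )


if __name__ == "__main__":
    P, Pbar, S, Sbar = coefficients_from_potential(
        (1, 2, 3, 4, 5, 6), (1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
    print(integrability_residuals(P, Pbar, S, Sbar))
```

In practice the four residuals would be evaluated with the summed coefficients of the actual system, and compared with the rounding error to be expected of the computation.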
Finally, we must consider the direct calculation of T. For this 
purpose one may again use to advantage a kind of hybrid method: 
‘hybrid’, that is to say, only in the very limited sense that for 
practical convenience g¥ and T; are separately expanded in powers 
of B’ and B. With regard to 7, one has in the first place, from (121.4), 


c= z [TY + T$(B;, B;) + t,(B;, B,)]- (130.7) 


The first sum on the right is a constant, which is, in fact, the optical 
distance between O,, and O;. We henceforth bring this constant 
over to the left and absorb it in 7. The usual paraxial substitutions 
then yield the pseudo-expansion 


T= TKTP@wy+E(owy (1308) 


It should be carefully noted that the meaning of T$” and t,, regarded 
as functional symbols, is not the same now as it was in (130.7). The 
exact equation which is generated by the pseudo-expansion 
(130.8) is 


k 
T= Py [T$(o+d,,,u+ dj) +t(o+4,;%+4d,;)]- (130.9) 
Using (127.2) in (124.12), one sees at once that D = d,,A0, and since 


according to (124.15), fU,v,A0 = (s—m)-1 (when fx = 1), we can 
write (124.17) in the form 


ke 
x TH(o+d,; n+d,;) = C/am+ > bid: dj, (130.10) 
j=1 j= 
where b; = 4(s —m)A0,. 
For the aberration function we thus finally get the equation 
k 
t(é, UB ¢) = a [t,(o + d,j, U AK dj) + b,d_,;. d, i). (130.1 1) 
3 = 


Let us suppose once again that only the third- and fifth-order 
aberration functions are to be found. The first of these, i.e. 2@), was 
already obtained previously. At any rate, we write 


tp = CEP) = Pg 8? + Dag EI + «+P 956? (130.12) 


The somewhat overworked coefficients on the right are to be regarded
as defined by this equation. They are evidently given by
(121.3) and (125.6), the summation signs being understood to have 
been omitted from the right-hand members of these six equations. 
We also require 


EO) = 816° +8969 +... +3196? (130.13) 


the index 7 having been suppressed throughout. To get the coeffi- 
cients 8, we must first find the terms of degree six of 7. There is a 
considerable resemblance between these and (125.1): 


164(E,9, 6) = fl(E —€)?(E + €) —K(S — 29 + £) (E—£)? 
+K9G—29+0)+Nri, (=), (130.14) 
all variables referring to the surface in question. Using (128.8), this 
becomes 
16¢¢9)(E, 9, £)> = f{(PAv? + QAv)* [P(Vo? — 22") + QVv + 2R] 
+x—l(i')? P?} + NriA[(v?P+vQ+R)3/v]. (130.15) 


From here we proceed as after (128.15). The factors multiplying 
OR? and Q? turn out to be zero. Set 


a1 = SP[(R?— 3k + 1)2? + 3(k — 1) 20+ 30°], 
32 = 2A(Y' +2), 


gg = g(v'2—vv’ + v?)'d, 


(130.16) 
35 = ?*a, 
io = ob", 
37 = 330) 


ae = —teP*(G/vv’), 


where p is given by (120.14), and a and ‘@ by (128.12). The 3, are 
of course quite distinct from those which appear in (128.17), 
though they are distantly related to them. Then 
$1 = bn 

8, = 6031 +32, 

3, = 3073, + 032+ 335 

8, = 12674, + 4050+ 34, 

35 = 12053, + 66740 + 20( 233 +34) +45 


Bg = 3043 + 2073 + O7(235 + 84) + 935+ 36, 
8, = 8693, + 40730 + 2644, 
8, = 12043, + 8095, + O7(443 + 584) + 2985 + 375 
By = 60°31 + 50%50 + 49°( 3+ 34) + 307354 20(S6 + 2)» 
B19 = 993, + P59 + O4( 53+ 54) + O55 + O7( 56 + 7) + 38: (130.17) 
It remains to write down the iteration equations for the co- 
efficients s, of the contribution by the jth surface to the fifth-order 
aberration function 7¢®(&,7,€), ic. to the terms of degree six of 
the expression within the square brackets on the right of equa- 
tion (130.11). They are, with b = }(s—m)~"fx, 
Sy = 3. +4P1Po1 + PoP pt Opin, 
So = 82+ 4)4(Poit Po2) + Po(3Po1 + Py +Prp2) 
+ 2(P3 +4) pit 26P (Pia +Py2)s 
S3 = 83+ 4)1Po3+Po(Po1t+ Pus) + 20s(Por + Pur) 
+ PsP O(2PaPyst+Pyr)s 
Sq = 84+ 4D1 Poe +Po(2Po1 + 3Po2 t+ Pye) + 2P3P 2+ 2Pa(Por 
+ Prat Py) t+ 2PsPat (Piet 2P Pot 2PrPya)s 
$5 = $5 +4D1Po3t+Pa(Po2 t+ 3Pos + Pus) + 2Pa(Por + Poot Pye + Pra) 
+ 2N4(Po1 + Pys) +P5(Port 3Pya + Pye) + 4P oP ut 
+ 20(D,2Pist+PurPy3t+PaPpstPiaPj2)s 
56 = 86 +Padas + 2Ps(Pos t+ Pus) +Ps(Port+ Pps) 
+4P6Pat O(Piigt 2P iP us)s 
Sy = $2 + 2PoPo2+ 2Pa(Poa + Pus) + 2PsPp2t 2bPu2P p25 
Sg = 83+ Popes t+ 2PsPo2+t 2Pa(Doa+ Pos + Pps) +Ps(Po2t 3P ye 
+ 2Py3) + 4P6Pu2 + 5(2Pp2Pp3 + 2PyaP 12 + Pina)» 
Sg = $9 + 2()3+P4) Pos +¥5(Po2t Pos + 3Pys) 
+ 4P6(P,2+Pys) + 20(Py2+Pus) Pras 
Sig = 819 + PsPos + 4PoPys t+ Ops: (130.18) 
The operation of the duality principle is clearly visible. 
The coefficients $s_{\alpha j}$ ($\alpha = 1, \ldots, 10$) may now be calculated from 
these equations for each of the surfaces, for all the third- and fifth- 
order pseudo-coefficients are known from equations (121.3) and 
(125.6) (without the signs of summation) and from (130.16-17); 
whilst the coefficients of the third-order increments are known from 
Section 129 (see, in particular, (129.11)). The coefficients of £® 
finally follow by mere summation over the whole system: 


k 
= D Sup (130.19) 
I= 


We note that the results embodied in the lengthy equations 
(128.17-18) and (129.7) were not used in obtaining the s, in the 
manner just described. To this extent they are indeed redundant. 
They have, however, been included since they will be required as 
soon as one wishes to use the present method to calculate the 
coefficients of the seventh-order aberration function. 


Problems 


P.12(i). Find the shape of the most general surface which has a 
pair of (non-coincident) conjugate points B, B’ entirely free from 
spherical aberration. (Such a surface is a ‘Cartesian ovoid’.) 


P.12(ii). Show that the surface of the preceding question can be 
a sphere, and find the locations of B and B’ when this is the case. 


P.12 (iii). Show by considering the angle characteristic that the 
conjugate points of the sphere defined in the two preceding prob- 
lems are such that rays through them satisfy the sine-condition. 
(Remark: The sphere thus has a pair of non-coincident conjugate 
planes such that not only spherical aberration but also circular 
coma of all orders is absent. The axial points of these planes are 
therefore called the aplanatic points of the sphere; cf. Section 33(1).) 


P.12(iv). By inspection of the third-order aberrations show that 
no surface other than a sphere possesses a pair of aplanatic points. 


P.12(v). Obtain an expression for g®(V’, V) and hence show that 
the coefficient of S(.S? + S?)? in (g®) is 


—8p[(k? —k+1)1274+3(Rk—1) v4 30". 
P.12 (vi). Show that (with fg; = 1) 
2g*p = ‘@(iz' —vv’), 
and hence that in (130.16) 
1635+ 444 = ‘G(R? -Rk+1)2?. 


P.12 (vii). Writing 5,=00,+ud,, etc., show from equation 
(127.19) that . 


95+ 895 = Sua + 901 — 31(85 + V5) — (Fi + YD] + O(4)- 


P.12 (viii). A set of variables x,, x9, ..., x, is related to a variable x 
through the set of equations 


J 
Ky = AX+ % Fil) (j= I,2, any) 


where Fj(2j) = Gag Xt} + Ogg 98 + Ay Xi +... 


The various a’s are given constants. Develop in detail an iterative 
method of solving these equations, i.e. for finding the x; as power 
series in x. Proceed at least to the seventh degree. 
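By way of illustration, the kind of iteration contemplated in P.12 (viii) may be sketched in a few lines. The sketch rests on assumptions which go beyond the statement of the problem: the system is taken to be $x_j = a_j x + \sum_{i=1}^{j} F_i(x_i)$, each $F_i$ is taken to contain odd powers only, and all function and parameter names are hypothetical. Starting from the linear terms, every pass of substitution secures two further degrees, so that three passes suffice for the seventh degree.

```python
N = 7  # highest degree retained in every truncated power series


def multiply(p, q):
    """Product of two truncated series (coefficient lists for x**0 .. x**N)."""
    r = [0.0] * (N + 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j <= N:
                r[i + j] += pi * qj
    return r


def apply_F(odd_coeffs, p):
    """Evaluate F(p) = sum over n of odd_coeffs[n] * p**n on a truncated series."""
    result = [0.0] * (N + 1)
    for degree, a in odd_coeffs.items():
        term = [1.0] + [0.0] * N
        for _ in range(degree):
            term = multiply(term, p)
        result = [r + a * t for r, t in zip(result, term)]
    return result


def solve_by_iteration(a, F_coeffs):
    """Series for x_1..x_k in powers of x, ASSUMING x_j = a_j*x + sum_{i<=j} F_i(x_i).

    a        : list of the linear coefficients a_j
    F_coeffs : list of dicts {3: a3, 5: a5, 7: a7} describing each F_i
    """
    k = len(a)
    linear = [[0.0, a_j] + [0.0] * (N - 1) for a_j in a]
    x = [row[:] for row in linear]
    for _ in range((N - 1) // 2):          # each pass gains two degrees
        F_values = [apply_F(F_coeffs[i], x[i]) for i in range(k)]
        running = [0.0] * (N + 1)
        new_x = []
        for j in range(k):
            running = [s + f for s, f in zip(running, F_values[j])]
            new_x.append([l + s for l, s in zip(linear[j], running)])
        x = new_x
    return x


if __name__ == "__main__":
    for series in solve_by_iteration([1.0, 0.8],
                                     [{3: 0.5, 5: -0.2}, {3: -0.3, 7: 0.1}]):
        print(series)
```

The passes here use the values of the previous pass throughout; one could equally well use each newly found series at once, which changes nothing in the final truncated result.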
Chapter 1 
P.1(i). We have dx/dt = 2xt, dy/dt = yx-t, dz/dt = zx-*. These 
derivatives are proportional to the direction cosines $\alpha$, $\beta$, $\gamma$ of the 
tangent to the ray at the point x, y, z, and so are the derivatives of F, 
if F(x, y, 2) = constant is the equation of the normal surface through 
the point in question. There therefore must exist a function @ such 


oat dF = «-*0(2ndx+ydy+zd2), (P.1(i), 1) 


granted that the given congruence of rays is normal. By inspection 
6 = x? is such a function. The required normal surfaces therefore 
Have Me equalOn: ce) a 38 = iGonst.: (P.1(i), 2) 
and they are evidently ellipsoids of revolution. 
P.1 (ii). By the argument of the preceding solution, there must 
exist a function @ such that 

oF oF oF 

Doe = da, by = Of, ass = by, 


and these require that 


Oy) 0B) 0x) _ Oy) _ AOR) (0a) _ 
oy dz "és Ox "Ox oy 


Multiply these equations by a, #, y respectively and add them. The 
required necessary condition follows at once. 


P.1 (iii). By inspection, taking N = 1,

$$ V^*(APA') = [(a - ky^2)^2 + (b - y)^2]^{1/2} + [(a - ky^2)^2 + (b + y)^2]^{1/2}. $$

Expanding this in powers of $y^2$,

$$ V^* = 2(a^2 + b^2)^{1/2} + a(a^2 + b^2)^{-3/2}[a - 2(a^2 + b^2)k]\,y^2 + O(y^4), \tag{P.1 (iii), 1} $$


from which the conclusion stated in (a) follows by inspection. 
Next if the reflecting surface is such that V is constant, its
equation must be

$$ [(a - x)^2 + (b - y)^2]^{1/2} + [(a - x)^2 + (b + y)^2]^{1/2} = 2(a^2 + b^2)^{1/2}. $$

From this one obtains by repeated squaring the required equation

$$ \frac{(x - a)^2}{a^2} + \frac{y^2}{a^2 + b^2} = 1, $$

which is that of an ellipse with A and A' as foci. It may be written as

$$ x = \tfrac{1}{2}[a/(a^2 + b^2)]\,y^2 + O(y^4). $$

Thus, near the origin one has a parabola for which $k$ has just the
value which makes the coefficient of $y^2$ in (P.1 (iii), 1) zero.
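The expansion (P.1 (iii), 1) is easily confirmed numerically. The following sketch (function names are hypothetical) evaluates $V^*$ directly for the parabola $x = ky^2$ with A at $(a, b)$ and A' at $(a, -b)$, and compares $(V^* - 2(a^2 + b^2)^{1/2})/y^2$ for small $y$ with the coefficient of $y^2$ quoted above.

```python
from math import hypot


def path_length(a, b, k, y):
    """V*(APA') for N = 1, with P = (k*y**2, y), A = (a, b), A' = (a, -b)."""
    x = k * y * y
    return hypot(a - x, b - y) + hypot(a - x, b + y)


def quadratic_coefficient(a, b, k):
    """Coefficient of y**2 in the expansion of V*, as in (P.1 (iii), 1)."""
    r2 = a * a + b * b
    return a * (a - 2.0 * r2 * k) / r2 ** 1.5


def stationary_k(a, b):
    """Value of k for which the y**2 term vanishes: that of the osculating ellipse."""
    return a / (2.0 * (a * a + b * b))


if __name__ == "__main__":
    a, b, y = 3.0, 4.0, 1.0e-3
    for k in (0.02, stationary_k(a, b)):
        estimate = (path_length(a, b, k, y) - 2.0 * hypot(a, b)) / (y * y)
        print(k, estimate, quadratic_coefficient(a, b, k))
```

For the second value of $k$ both printed numbers should be very nearly zero, the remaining discrepancy being of the order of $y^2$.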


P.1 (iv). One may think of the medium as stratified, the equation
of the boundary of any layer being N = constant. Then, if
neighbouring points on the ray define the displacement $d\mathbf{s}$,

$$ d(N\mathbf{e}) = \kappa\,\mathrm{grad}\,N\,ds, $$

since in (2.6) the normal $\mathbf{p}$ is in the direction of grad N,
$\kappa$ being a scalar factor. Scalar multiplication throughout by
$\mathbf{e}$ gives

$$ \mathbf{e}\cdot d(N\mathbf{e}) = N\mathbf{e}\cdot d\mathbf{e} + \mathbf{e}\cdot\mathbf{e}\,dN = dN = \kappa\,d\mathbf{s}\cdot\mathrm{grad}\,N = \kappa\,dN, $$

bearing in mind that $\mathbf{e}\cdot\mathbf{e} = 1$. Hence
$\kappa = 1$, so that one has the required equation.


Chapter 2 
P.2(i). The system is invariant under rotations about the x-axis,
and also invariant under translations along the y- and z-axes. Its
point characteristic must therefore have the generic form 


V = f(x’, x, u), (P.2(i), 1) 


where $u = (y' - y)^2 + (z' - z)^2$; see also Sections 13 and 84. By
rotation and translation we can always arrange the coordinates of
the initial point A and the final point A' to become $(x, 0, 0)$ and
$(x', \bar y', 0)$ respectively, relative to the new coordinates. Then

$$ V = f(x', x, \bar y'^2). \tag{P.2 (i), 2} $$

In view of the elementary laws of refraction the ray through A and
A' will lie in the plane $z = 0$, and we need to consider this alone.
Then if $P(a, y_0, 0)$ is the point of incidence

$$ V^*(APA') = N'(PA') + N(AP) = N'[(x' - a)^2 + (\bar y' - y_0)^2]^{1/2} + N[(a - x)^2 + y_0^2]^{1/2}. \tag{P.2(i), 3} $$

This must be stationary with respect to small variations of $y_0$, and so

$$ k y_0[(a - x)^2 + y_0^2]^{-1/2} = (\bar y' - y_0)[(x' - a)^2 + (\bar y' - y_0)^2]^{-1/2}, $$

with $k = N/N'$. This is a quartic equation for $y_0$ which may be
written as

$$ (y_0^2 + A)(y_0 - \bar y')^2 - B y_0^2 = 0, \tag{P.2(i), 4} $$

where $A = (a - x)^2/(1 - k^2)$, $B = k^2(x' - a)^2/(1 - k^2)$. Let the
appropriate root of (P.2(i), 4) be

$$ y_0 = \chi(x', x, \bar y'). $$

Then V is the function which is obtained when $\chi(x', x, \bar y')$ is
substituted for $y_0$ on the right of (P.2(i), 3) and $\bar y'$ then
replaced by $[(y' - y)^2 + (z' - z)^2]^{1/2}$ in the function which
results from the first substitution.
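Since the appropriate root of the quartic is not available in convenient closed form, it may be helpful to see how it can be found numerically. The sketch below is only one possible arrangement and its names are hypothetical. It assumes $x < a < x'$, $\bar y' > 0$ and $N \neq N'$, locates the stationary value of $y_0$ by bisection on the derivative of (P.2(i), 3), and then confirms that this value satisfies (P.2(i), 4).

```python
from math import hypot


def path_derivative(y0, x, a, x1, y1, n, n1):
    """d V*/d y0 for V* = n1*|PA'| + n*|AP|, with A = (x, 0), P = (a, y0), A' = (x1, y1)."""
    return n * y0 / hypot(a - x, y0) - n1 * (y1 - y0) / hypot(x1 - a, y1 - y0)


def stationary_height(x, a, x1, y1, n, n1, iterations=200):
    """Bisection for the zero of the derivative; ASSUMES x < a < x1 and y1 > 0."""
    lo, hi = 0.0, y1          # the derivative is negative at 0 and positive at y1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if path_derivative(mid, x, a, x1, y1, n, n1) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def quartic_residual(y0, x, a, x1, y1, n, n1):
    """Left-hand member of (P.2(i), 4); requires n != n1."""
    k = n / n1
    A = (a - x) ** 2 / (1.0 - k * k)
    B = k * k * (x1 - a) ** 2 / (1.0 - k * k)
    return (y0 * y0 + A) * (y0 - y1) ** 2 - B * y0 * y0


if __name__ == "__main__":
    args = (-2.0, 0.0, 3.0, 1.0, 1.0, 1.5)   # x, a, x', y-bar', N, N'
    y0 = stationary_height(*args)
    print(y0, quartic_residual(y0, *args))
```

The vanishing of `path_derivative` is of course nothing but the law of refraction, $N\sin\theta = N'\sin\theta'$, for the plane surface $x = a$.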


P.2(ii). Because in this limit the system is telescopic, and the
ray-coordinates $\beta'$, $\gamma'$, $\beta$, $\gamma$ can no longer be
given values independently of each other.


P.2(iii). Since the points B', B have the coordinates $(q', 0, 0)$ and
$(q, 0, 0)$ respectively, we have by an elementary geometrical
argument

$$ T = \Delta\{N[(q - x_P)\alpha - (y_P\beta + z_P\gamma)]\}, \tag{P.2 (iii), 1} $$

where $x_P$, $y_P$, $z_P$ are the coordinates of the point of incidence
P, and, whatever X may be, $\Delta X = X' - X$. Note that
$N'(y_P\beta' + z_P\gamma')$, for instance, is the optical length of
the part of the refracted ray, produced backwards, between the foot
of the normal from the origin on to the ray and P. Now when the
surface is spherical one has in (2.6)

$$ \mathbf{p} = (1 - cx_P,\; -cy_P,\; -cz_P), \tag{P.2 (iii), 2} $$

where $c = 1/r$. Hence

$$ \sigma\,\mathbf{p}\cdot\mathbf{p} = \sigma = \Delta(N\mathbf{e}\cdot\mathbf{p}) = \Delta\{N[\alpha - c(x_P\alpha + y_P\beta + z_P\gamma)]\}. $$

Using this on the right of (P.2 (iii), 1), there comes

$$ T = \Delta[N(q - r)\alpha] + r\sigma. \tag{P.2 (iii), 3} $$

On the other hand, taking the scalar product of each member of
(2.6) with itself, one has

$$ \begin{aligned}
\sigma^2 &= [\Delta(N\mathbf{e})]\cdot[\Delta(N\mathbf{e})] \\
&= N'^2 + N^2 - 2NN'\,\mathbf{e}'\cdot\mathbf{e} \\
&= N'^2 + N^2 - 2NN'(\alpha\alpha' + \beta\beta' + \gamma\gamma') \\
&= N'^2 + N^2 - 2NN'[(1 - \xi)^{1/2}(1 - \zeta)^{1/2} + \eta].
\end{aligned} $$

With this result, (P.2 (iii), 3) will be seen at once to be equivalent to
(4.5). (See also equations (117.2-17).)
P.2(iv). The required optical distance is, by inspection,

$$ W_1 = N'[x'\alpha' + (y'\beta' + z'\gamma') - (y_P\beta' + z_P\gamma')] + N(y_P\beta + z_P\gamma). \tag{P.2 (iv), 1} $$

In view of (2.6),

$$ N'\beta' = N\beta, \quad N'\gamma' = N\gamma, \tag{P.2 (iv), 2} $$

since $p_y = p_z = 0$ here, and so (P.2(iv), 1) reduces to

$$ W_1 = N'x'\alpha' + N(y'\beta + z'\gamma). \tag{P.2 (iv), 3} $$

But $N'^2\alpha'^2 = N'^2 - N'^2(\beta'^2 + \gamma'^2) = N'^2 - N^2(\beta^2 + \gamma^2)$,
by (P.2(iv), 2), so that (P.2(iv), 3) is just the stated result.

P.2(v). The coordinate axes having been suitably chosen, equations 
(7.2), taken together with (6.2), require that 


LL a eT 


for all rays. The first of these shows that $\beta$ and $\beta'$ can
occur in T only in the combination $\beta - m\beta'$, and the second
likewise shows that $\gamma$ and $\gamma'$ can only occur in the
combination $\gamma - m\gamma'$. Hence the required result is

$$ T = g(\beta - m\beta', \gamma - m\gamma'), \tag{P.2(v), 2} $$

where $g$ is some function of two arguments.
Chapter 3 


P.3(i). In the notation of equation (117.2), the equation of the 
paraboloid is 


s = 2k-tut, (P.3 (i), 1) 

Then in (117.15) yy = u-?/4k, (P.3 (i), 2) 
whence in (117.12) 

ui—}s=—Yeht=—y gk. (P33) 


Equation (117.12) therefore becomes 
T = A(Nqa) —(1/4k) (ANa)? [(AN)? (1 + 2Kx) —(ANa)?}-3. 
(P. 3), 4) 
Since the expression in the square brackets vanishes when 
Paya pays 


it follows at once that J cannot be written as a power series in the 
ray-coordinates. 


P.3(ii). W has the property in question for example when it is the 
surface of an anchor-ring, i.e. a circular torus (see equation (93.1)). 
Such a surface can be thought of as generated by rotating a circle 
of radius R,, centre C, about a line # in the plane of the circle, where 
the perpendicular distance R, from C to & satisfies the inequality 
R, > R,. In the course of the rotation C itself generates a circle 
of radius R,. It is obvious that every normal to Y%, i.e. every ray, 
passes through both @ and %. 

Take one such ray, and suppose that the angle which it makes 
with F is not a right angle: this will be the case for every ray Z 
which does not happen to lie in the plane of @. Next contemplate 
a narrow bundle of rays, i.e. a general parabasal bundle, about @. 
These will cover small segments of both @ and #, and these seg- 
ments are indeed the parabasal focal lines in this case. The first 
of these is normal to &, but the second is not. 


P.3 (iii). By substitution from (10.3) one finds that 
Ay — Ay = b[H'2' — By’) + bi 2y" —Iz') 
+ be(Hz' — By") + by(Zy —Jz2). 
A,— A, must reduce to a similar expression except that y and z will 
be interchanged throughout. Hence Aj,— A, = —(A,—A,), which is 
the required result. 


P.3(iv). In the first place one has to have b, = b, and then we must 
take d’ = —1/b,. Set y = pcos0, z = psin@ in (10.15). Eliminating 
0 one gets 


p°d"*(Byb, — Bb)? = (02 +b3)¥'2—2(bybg + byb,)¥'Z’ 
+ (63+ 63)Z".  (P.3(iv), 1) 
The discriminant of the quadratic form on the right is negative, and 


(P.3 (iv), 1) therefore represents an ellipse. The condition that it 
must not shrink to a point is 


b3b, — by bg + 0, (P.3 (iv), 2) 
which is just (10.4). 


P.3(v). The required equations come from the first member 
of (3.6): 
[ao + (31? + bay'a’ +... + tbi92%)]? + (yy + gy +b42)? 
+ (byy' +652’ + bey +b7z)* = 1, (P.3(v), 3) 
where a dot denotes differentiation with respect to x’. Hence the 


term independent of the ray-coordinates requires that dj = +1, 
and since on the axis V is an increasing function of x’, 


ay = x. (P.3(v), 2) 


By inspection of the quadratic terms of (P.3(v), 1) one then has the 
differential equations 


b,+b}=0, b,=0, b,+b,b,=0, 6,+b,b, =0, 
bs+b2 =0, bgtbsbg=0, b,+b,b, =0, 
bg t+ b2+b§= 0, bytbybytbgb, = 0, by +b2+b2 =0. 
(P.3(v), 3) 


Let the coordinates be so taken that the initial values of the various
functions relate to $x' = 0$; and let these values be distinguished by
bars. Then, from the first of (P.3(v), 3),

$$ b_1(x') = \bar b_1/(1 + \bar b_1 x'). \tag{P.3(v), 4} $$

The second equation is trivial, and the third gives

$$ b_3(x') = \bar b_3/(1 + \bar b_1 x'), \tag{P.3(v), 5} $$

and one can easily continue in this way.
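The two closed forms just obtained can be compared with a direct numerical integration. The sketch below assumes that the first and third of the differential equations read $\dot b_1 + b_1^2 = 0$ and $\dot b_3 + b_1 b_3 = 0$, the dot denoting differentiation with respect to $x'$; the function names and the choice of a fourth-order Runge-Kutta scheme are merely illustrative.

```python
def rhs(b1, b3):
    """Right-hand members of the ASSUMED equations b1' = -b1**2, b3' = -b1*b3."""
    return -b1 * b1, -b1 * b3


def integrate(b1, b3, x_end, steps=1000):
    """Classical fourth-order Runge-Kutta integration from x' = 0 to x' = x_end."""
    h = x_end / steps
    for _ in range(steps):
        k1 = rhs(b1, b3)
        k2 = rhs(b1 + 0.5 * h * k1[0], b3 + 0.5 * h * k1[1])
        k3 = rhs(b1 + 0.5 * h * k2[0], b3 + 0.5 * h * k2[1])
        k4 = rhs(b1 + h * k3[0], b3 + h * k3[1])
        b1 += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        b3 += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return b1, b3


def closed_form(b1_bar, b3_bar, x):
    """The solutions (P.3(v), 4) and (P.3(v), 5) with initial values b1_bar, b3_bar."""
    return b1_bar / (1.0 + b1_bar * x), b3_bar / (1.0 + b1_bar * x)


if __name__ == "__main__":
    print(integrate(0.5, -0.3, 2.0))
    print(closed_form(0.5, -0.3, 2.0))
```

The integration must of course not be carried across the value of $x'$ at which $1 + \bar b_1 x'$ vanishes.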
P.3 (vi). Take the generators of the cylinder to be parallel to the 
z-axis. When the whole cylinder is displaced in this direction one 
ends up with the same optical system, and so one must have 

V(y', 2’ +4,y, 2+a) = Vy’, 2’, 9, 2), 
for any a. It follows that V can depend upon z and 2’ only in the 


combination z’ —z. Moreover, the positive z-direction is here not 
preferred over the negative z-direction, and so V must be an even 


function of 2’ —z: V =VIiy',9,(2’—2)%) (P.3(vi), 1) 


and this function can be written as a power series in the three 
variables which occur as arguments. (See also Section 84.) 


P.3(vii). Yes. There are five for a fixed position of the object, 
namely the coefficients of y’3, yy, y’y*, y'(2’ —2)?, W(2" — 2). 


Chapter 4 
P.4(i). Using (14.23), with $ replaced by 7, one has, in view of 
(14.29), the pair of equations 
mq’ = m(d'+smq), q = d'+mmq. 
These are very easily solved for g and q’, and recalling (14.27) one 
has at once oi 
g=o-mf a=(5-2)f£ Pan 

P.4(ii). From the basic equations B = — @V/éy = —y'V, —2yl, 
and @’ = aV /ay’ = 2y'V;+yV,. These can be written in the form 
(14.5), with 

A=~2ViV, B=—1/V,, C =(V3-4V,%),, D = -2¥h, 
and (14.7) follows from these by inspection. 


P.4 (iii). We argue exactly as in Section 15, except that since the 
ratios Y’/y and Z'/z need no longer be constant we have to write 


Y’ = mD(Oy, (P.4(iii), 1) 
where $D(0) = 1$. The generic form of (P.4 (iii), 1) is required by the 
symmetry of K, and by the fact that Y’ must be independent of y’. 
Using the same rotational invariants as those which occur in (15.6), 
the required result follows at once. D({) is a ‘distortion function’ 
whose meaning is defined by (P.4 (iii), 1). 


P.4 (iv). All rays from an object point O unite in a single point
O'$(X', Y', Z')$ where, by hypothesis,

$$ X' = C(\zeta), \quad Y' = D(\zeta)\,y_1. \tag{P.4(iv), 1} $$

Here C and D are functions of $\zeta$ such that $C(0) = 0$ and
$D(0) = 1$. Using the usual argument, one gets

$$ V = g(\zeta) - [(d' + C)^2 + \xi - 2D\eta + D^2\zeta]^{1/2}. \tag{P.4(iv), 2} $$

This has the required form if one sets

$$ K = 2d'C + C^2 + \zeta D^2. \tag{P.4(iv), 3} $$

Then, writing $C(\zeta) = c_1\zeta + \ldots$, $D(\zeta) = 1 + d_1\zeta + \ldots$,

$$ K(\zeta) = k_1\zeta + \ldots = (1 + 2d'c_1)\zeta + \ldots, $$

so that

$$ c_1 = (k_1 - 1)/2d'. \tag{P.4(iv), 4} $$

Again, from $X' = c_1\zeta + \ldots$, $Y'^2 + Z'^2 = \zeta + \ldots$ one has

$$ X' = c_1(Y'^2 + Z'^2) + \ldots. $$

The required curvature is therefore $2c_1 = (k_1 - 1)/d'$.


P.4(v). By differentiation of (P.4(iv), 2) we have

$$ \beta' = -(y' - Dy_1)/R, \tag{P.4(v), 1} $$

where R is the square root on the right of that equation, whence

$$ \alpha' = (d' + C)/R. \tag{P.4(v), 2} $$

Therefore

$$ \varepsilon' = y' - y_1 + d'\beta'/\alpha' = (d' + C)^{-1}[d'(D - 1)y_1 + C(y' - y_1)]. \tag{P.4(v), 3} $$

When $C = 0$ this becomes $\varepsilon' = (D - 1)y_1$, as it must.


P.4(vi). If one, say, doubles the size of the object at infinity one 
evidently doubles the slope of the initial rays, all of which are 
mutually parallel. Therefore $y'$ is proportional to $\beta/\alpha$. From (14.35)
one has, in the paraxial limit, $y' = f(\beta - m\beta')$, and taking the limit
$m \to 0$, the required constant of proportionality is seen to have the
value $f$.


P.4(vii). Consider the plane through E normal to a family of 
initial rays. Let a particular such ray Z intersect this plane in the 
point D and 6” in the point D’. Then V(D, O’) is a function of 8 


arenes P(D, O') = g(0). (P.4 (vii), 1) 
Moreover, VD, O') = V(D, D’)+ V(D',O’), 
ie. g(S) = Wooly',2', By) +[2? + (fBle—y'P + (fy/a—2'P 

(P.4 (vii), 2) 
where (15.14) has been used. Equation (P.4 (vii), 2) can be rewritten 


immediately in the required form (15.20). 


P.4(viii). Allowing for the fact that the functions $g(\zeta)$ which occur
in $V$ and $V_0$ respectively need not be the same, we have

$v = V - V_0 = (1 + \xi - 2\eta + \zeta)^{1/2} - (1 + \xi - 2D\eta + D^2\zeta)^{1/2} + g^*(\zeta)$. (P.4(viii), 1)

It is best to write this first as

$v = (1 + u)^{1/2} - [(1 + u) - 2(D - 1)\eta + (D^2 - 1)\zeta]^{1/2} + g^*(\zeta)$, (P.4(viii), 2)

with $u = \xi - 2\eta + \zeta$. Using the power series for $D$, this becomes

$v = (1 + u)^{1/2} - \{1 + u - 2d_1(\eta\zeta - \zeta^2) - [2d_2\eta\zeta^2 - (2d_2 + d_1^2)\zeta^3] \dots\}^{1/2} + g^*(\zeta)$.

Expanding this in the usual way one finds, on rejecting terms of
degree exceeding six and terms depending on $\zeta$ alone, that

$v^{(2)} = d_1\eta\zeta$, $v^{(3)} = \tfrac{1}{2}[-d_1\xi\eta\zeta + d_1\xi\zeta^2 + 2d_1\eta^2\zeta + (2d_2 - 3d_1)\eta\zeta^2]$. (P.4(viii), 3)
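The expansion (P.4(viii), 3) can be checked numerically. The following is only a sketch under stated assumptions: $D$ is truncated to $1 + d_1\zeta + d_2\zeta^2$, $u = \xi - 2\eta + \zeta$, the values of $d_1$, $d_2$ and of the coordinates are arbitrary illustrative numbers, and the function names are hypothetical. Subtracting the value at $\xi = \eta = 0$ removes every term depending on $\zeta$ alone, including $g^*(\zeta)$.

```python
import math

def v_exact(xi, eta, zeta, d1, d2):
    """v of (P.4(viii), 2) without g*, minus its value at xi = eta = 0.

    Assumes D = 1 + d1*zeta + d2*zeta**2 (truncated power series).
    """
    D = 1.0 + d1 * zeta + d2 * zeta ** 2

    def f(xi_, eta_):
        u = xi_ - 2.0 * eta_ + zeta
        inner = 1.0 + u - 2.0 * (D - 1.0) * eta_ + (D * D - 1.0) * zeta
        return math.sqrt(1.0 + u) - math.sqrt(inner)

    # f(0, 0) collects the terms that depend on zeta alone
    return f(xi, eta) - f(0.0, 0.0)

def v_series(xi, eta, zeta, d1, d2):
    """Third- and fifth-order parts v^(2) + v^(3) as quoted in (P.4(viii), 3)."""
    v2 = d1 * eta * zeta
    v3 = 0.5 * (-d1 * xi * eta * zeta
                + d1 * xi * zeta ** 2
                + 2.0 * d1 * eta ** 2 * zeta
                + (2.0 * d2 - 3.0 * d1) * eta * zeta ** 2)
    return v2 + v3

def truncation_error(eps, d1=0.3, d2=-0.2):
    """Difference between exact and truncated v when xi, eta, zeta scale like eps."""
    xi, eta, zeta = 1.0 * eps, 0.5 * eps, 0.8 * eps
    return abs(v_exact(xi, eta, zeta, d1, d2) - v_series(xi, eta, zeta, d1, d2))
```

Because the first neglected terms are of degree eight in the coordinates, the error should fall by a factor of about sixteen each time `eps` is halved.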
Note that in (23.2-3) the non-zero characteristic coefficients are:

$p_5 = d_1$, $s_5 = -\tfrac{1}{2}d_1$, $s_6 = \tfrac{1}{2}d_1$, $s_8 = d_1$, $s_9 = d_2 - \tfrac{3}{2}d_1$, (P.4(viii), 4)
and one then finds that the only non-vanishing effective coefficients
are = od 

ps =d,, 5 =d,, (P.4 (viii), 5) 


consistently with (P.4(v), 3). 
When V is given by (P.4(iv), 2) one finds after the same fashion 


es of = $e, 86+ (dye) 1b. (P.4 (viii), 6) 
The only non-vanishing coefficients in (19.6) are 
y=, Of = d,—C. (P.4 (viii), 7) 
P.4(ix). Relative to coordinates with origin at $I'$ (Fig. 4.2) the
equation of a meridional ray is (with $d' = 1$)

$y = -\rho x + \sigma_1\rho^3$, (P.4(ix), 1)

where $\sigma_1\rho^3$ is neglected compared with $\rho$. Differentiating (P.4(ix), 1)
with respect to $\rho$,

$3\sigma_1\rho^2 = x$, (P.4(ix), 2)

and upon eliminating $\rho$ between the two preceding equations there
comes

$27\sigma_1 y^2 = 4x^3$. (P.4(ix), 3)

In view of the axial symmetry of the caustic surface, this result
implies (19.11).
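The elimination leading to (P.4(ix), 3) can be illustrated with a short numerical sketch. Here `sigma1` stands for the coefficient and `rho` for the ray parameter in (P.4(ix), 1); both names, like the function names, are hypothetical labels chosen for the example. Each ray touches the envelope where its height is stationary with respect to `rho`, which is the content of (P.4(ix), 2).

```python
def ray_height(x, rho, sigma1):
    """Height of the meridional ray with parameter rho at abscissa x, (P.4(ix), 1)."""
    return -rho * x + sigma1 * rho ** 3

def caustic_point(rho, sigma1):
    """Point at which the ray with parameter rho touches the envelope.

    Setting the rho-derivative of the ray height to zero gives x = 3*sigma1*rho**2.
    """
    x = 3.0 * sigma1 * rho ** 2
    return x, ray_height(x, rho, sigma1)

def caustic_residual(rho, sigma1):
    """Left minus right side of 27*sigma1*y**2 = 4*x**3 at the contact point."""
    x, y = caustic_point(rho, sigma1)
    return 27.0 * sigma1 * y ** 2 - 4.0 * x ** 3
```

For any `rho` the residual vanishes to rounding error, since the contact point is $x = 3\sigma_1\rho^2$, $y = -2\sigma_1\rho^3$.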
P.4(x). We proceed after the fashion of the work following equation (19.9). Write $2nv_1^{(n)} = b$, $\rho = -c\rho_1$, $x = k\rho_1^{2n-2}$, so that

$y = b\rho^{2n-1} - k\rho_1^{2n-2}\rho$. (P.4(x), 1)

Differentiation with respect to $\rho$ leads to the equation

$(2n - 1)b\rho^{2n-2} = k\rho_1^{2n-2}$,

or $k = (2n - 1)bc^{2n-2}$. (P.4(x), 2)

(19.9) then becomes

$2(n - 1)c^{2n-1} + (2n - 1)c^{2n-2} - 1 = 0$. (P.4(x), 3)

Removing a factor $(c + 1)^2$ on the left,

$\sum_{s=0}^{2n-3} (-1)^s (s + 1)c^s = 0$, (P.4(x), 4)

which is identical with (21.7).
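The step from (P.4(x), 3) to (P.4(x), 4) rests on the polynomial in $c$ having the double root $c = -1$. A minimal sketch of an exact check follows, using integer coefficient lists in ascending powers of $c$. The function names are hypothetical, and the alternating sum is taken in the form $\sum_{s=0}^{2n-3}(-1)^s(s+1)c^s$, which is an assumed reading of (P.4(x), 4). The overall sign does not affect the roots.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (ascending powers)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def caustic_polynomial(n):
    """Coefficients of 2(n-1)c^(2n-1) + (2n-1)c^(2n-2) - 1, for n >= 2."""
    coeffs = [0] * (2 * n)
    coeffs[0] -= 1
    coeffs[2 * n - 2] += 2 * n - 1
    coeffs[2 * n - 1] += 2 * (n - 1)
    return coeffs

def reduced_polynomial(n):
    """Coefficients of sum_{s=0}^{2n-3} (-1)^s (s+1) c^s."""
    return [(-1) ** s * (s + 1) for s in range(2 * n - 2)]

def factorization_holds(n):
    """True if (c+1)^2 times the reduced polynomial equals minus the caustic polynomial."""
    product = poly_mul([1, 2, 1], reduced_polynomial(n))
    return product == [-c for c in caustic_polynomial(n)]
```

For $n = 2$ the reduced polynomial is $1 - 2c$, so that $c = \tfrac{1}{2}$.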


P.4(xi). By inspection of (19.7) and (25.1) the conditions are
(i) Pa =0, (ii) sy = 3p, (iil) 55+ 457 = 2pg+ $Pa— Ps. 
(P.4 (xi), 1) 
P.4(xii). The displacement in $\mathcal{I}'$ is given by (P.4(v), 3). The effective distortion is the displacement for the ray through $E'$. Hence, setting $y' = 0$, and bearing in mind that $\epsilon'$ then vanishes by hypothesis, we must have

$D = C + 1$, (P.4(xii), 1)

which is the required result, in view of (P.4(iv), 3).

P.4(xiii). From (P.4(iv), 1) and (P.4(xii), 1),

$X' = D - 1$, $Y'^2 + Z'^2 = \zeta D^2$. (P.4(xiii), 1)

Setting $D = 1 + d_1\zeta + d_2\zeta^2 + \dots$, we therefore have

$X' = d_1\zeta + d_2\zeta^2 + \dots$, $Y'^2 + Z'^2 = \zeta + 2d_1\zeta^2 + \dots$.

Eliminating $\zeta$ between these equations the required result follows
immediately.


P.4(xiv). We have $\beta' = (y' - y_1)\dot{f}(u)$, where a dot denotes differentiation with respect to $u$. Hence $\alpha'^2 = 1 - u\dot{f}^2$, and therefore

$\epsilon' = (y' - y_1)[1 + (1 - u\dot{f}^2)^{-1/2}\dot{f}]$. (P.4(xiv), 1)

One sees immediately that $\epsilon_y'^2 + \epsilon_z'^2$ is a function of $u$ only, and $\alpha'$
likewise depends upon $u$ alone; and this implies the stated result.
Taking $z_1 = 0$ as usual, one has

$\epsilon_y'/\epsilon_z' = (y' - y_1)/z'$,

so that the circle is not concentric with $I'$.


Chapter 5 


P.5(i). Since coma of all orders is definitely present, third-order 
coma must be present and for this, and therefore for all orders, the 
comatic ratio has the value 3, i.e. 


$K_t = 3K_s$. (P.5(i), 1)


Equations (33.5-6) become, since d = 0, 


HK) ans gr 2K 
3K = Fe tan? p io 
and, in view of (32.6), i.e. $\tan\phi' = -\rho$, one has the differential
equation

$\rho(1 + \rho^2)\,\dfrac{dK}{d\rho} = 2K$, (P.5(i), 2)

which has the solution

$K = \dfrac{\text{const.}\,\rho^2}{1 + \rho^2}$. (P.5(i), 3)
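That the function quoted in (P.5(i), 3) satisfies the differential equation (P.5(i), 2) can be confirmed with a central-difference sketch. The step size `h` and the function names are illustrative assumptions, not part of the original solution.

```python
def K(rho, const=1.0):
    """Candidate solution K = const * rho**2 / (1 + rho**2)."""
    return const * rho ** 2 / (1.0 + rho ** 2)

def ode_residual(rho, const=1.0, h=1e-6):
    """rho*(1 + rho**2)*dK/drho - 2*K, with dK/drho from a central difference."""
    dK = (K(rho + h, const) - K(rho - h, const)) / (2.0 * h)
    return rho * (1.0 + rho ** 2) * dK - 2.0 * K(rho, const)
```

Analytically $dK/d\rho = 2\,\text{const.}\,\rho/(1 + \rho^2)^2$, so the residual is zero identically; the numerical value differs from zero only through the finite-difference and rounding errors.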


P.5(ii). We are concerned with rays through $O$ and $E'$, and for
these $\rho = 0$, i.e. $\xi = \eta = 0$. Let

$v = \eta D(\zeta) + \dots$, (P.5(ii), 1)

where the dots stand for terms at least quadratic in $\rho$. Then

$\beta' = y_1[(1 + \zeta)^{-1/2} + D(\zeta)]$, (P.5(ii), 2)

since $y' = 0$. Let $\psi'$ be the angle between the axis of $K$ and the line
$E'I'$, and take $z_1 = 0$ as usual. Then we define as the analogue of


PAPEROCUBOEY! ec piy ig ooh (P.5 (ii), 3) 
and so Ap = h'D(6). 
P.5 (iii). Set x, = m in (40.5) and use (40.11). 
P.5(iv). The result is 4(n—2)(n—3)k;—2(n—3)kg3+3ky, but 
one should remember the powers of c absorbed in &,, cf. (51.2). 
P.5(v). Setting 6, = p. = 0, and p, = 0 (a = 1,...,5) in the first 
two members of (45.9), one has the pair of equations 

P’Po tht —mM*/m) = 0, bp*pg—j,(1 —sm*/m) = o. 
Their mutual compatibility requires that 

p(t —sm?/m) + b(1 — m3/m) = o. 

With (45.5) this reduces to m? = 1. 
P.5(vi). One here has 

q’~r=Nr\(N'-N), q-r= —N'r|(N’—N), 
and f=(NN’'r)/(N’—N), so that incidentally f=1 implies 


(N’—N)r = «-1. Use of these equations in (4.5) gives the required 
result at once. 


P.5 (vii). By the usual argument (cf. P.4(iii)) 
Vt = gt()—[1 —2(D—1) 4 + ¢D*}3, 
where D is defined by the equation Y’ = D(£)y,. 
Chapter 6
P.6(i). Yes. In the limit k > —1 equation (63.12) becomes 
x = f(m*—1)(y?+2%). (P.6(i), 1) 
Then (63.13) becomes at the same time 
X’ = 4(m-?—1)(Y" +2”), (P.6(i), 2) 


which is also the equation of a paraboloid. 


P.6(ii). Let $z = 0$ in (63.12) and let $|x|$ and $|y|$ become large. The
equation becomes that of a pair of straight lines, that is, of the
asymptotes, and one reads off the result

$\tan\psi = k^{-1}(k^2 - 1)^{1/2}(m^2 - 1)^{-1/2}$. (P.6(ii), 1)

In the same way from (63.13)

$\tan\psi' = m(k^2 - 1)^{1/2}(m^2 - 1)^{-1/2}$. (P.6(ii), 2)

Replacing $k$ by $\tan\psi'/m\tan\psi$ in (P.6(ii), 1), the stated result
follows.
P.6(iii). It is advisable to use y rather than y, as coordinates here. 
oe V = 9(f)—(1 + £—2mq +m )t+o(€, 4,6),  (P-6 (iii), 1) 


and since, when s = 1/m, O,C = CE’, this function must be in- 
variant under the mutual interchange of y and y’, i.e. of & and ¢. 
To the required order we therefore have to consider the invariance 
of 

~ $(E — amy + mE) — 3(m/[f) b+ 3(E — 2my +m? 6)? + vOE, 7, 6). 


Note that d’ = (s—m)f = (m~!—m) f = 1, so that the coefficient of 
6 is —}(m?+m/f) = —4, as must be the case to ensure invariance 
of the paraxial part of V. We are left with 


VANE, UB g) a ag ay 2my - mC)? = vA, q, £) + HE a 2my + mt). 
(P.6 (iii), 2) 


Rejecting the unwanted relation which involves p, we have the one 
remaining relation ips pal, (P.6 (iii), 3) 


With the present coordinates h will appear in e’ in place of h’. 
Going over to h’ therefore we must replace p, by mp, and p; by m°,,

the new coefficients being those which occur in (19.1). Bearing 
(19.6) in mind we thus arrive at 

(o4/m?) — 05 = }(1 — mm. (P.6 (iti), 4) 


When d’ + 1, we have to supply a factor d’-? on the right. But we 
have seen that d’ = (1 —m?) f/m; so that when / is now taken to be 
unity we have to supply a factor m?(1—m?)-? on the right of 
(P.6(iii), 4); and the latter thereupon becomes identical with (56.12). 


P.6(iv). V, refers to a curved posterior base-surface, not &. 
P.6(v). Write (1 —2£)# (1 —26)3 = Dd, t”. (P.6(v), 1) 


Differentiate with respect to ¢ and then multiply by (1 —¢&) (1 —#€) 
throughout. There comes 


— 2[O(1 — 26) + (1 — t€)] SP, t” = (1 — 28) (1 —t€) Ding, 274, 
whence 
—3(§ +6) Lbnt” + EOO9,, t”*1 — Sind, t7-1 4 (E+ 6) Sng, t” 
—£6>d.nd, tt =0. (P.6(v), 2) 


The factor multiplying ¢*(s = 0,1,2,...) must vanish, and this 
requires just that the stated recurrence relation be satisfied. 


Chapter 7 
P.7(i). Setting P = 1, J = 1 in (82.2) one has 
ey = (R-Q)7+(R-Q)r]y'—(1-R) 1, 
é, = [(R-Q)9+(R-Q)7]2'+Ry, 
with z, = 0, as usual. Introducing polar coordinates in the cus- 


tomary way, e’ becomes a function of 26, and the terms independent 
of 0 give the coordinates of the centres of the zonal circles. They are 


9 =[HR-Q)P?-(1-R)]h', 2 =[HR-O)p? +R]. 
| (P-7(i),2) 
A straight line is generated if and only if ) = k? for some constant k, 


i.e. E(2w®C+C)+2C = k[E(awC+C)+2C]. (P.7(i), 3) 



Set C=k°CH+Y. 

Then the equation for I" is 
2f(1+6)0+(€+2)T =0, 


the solution of which near = 0 is const. £—1, which means that T 
must vanish, taking regularity into account. Hence C' and C must 
stand in a constant ratio to one another. 


P.7(ii). The possibility of determining $ and m is assured if the 
set of equations (79.3) is soluble. The discriminant of these equations 


turns out to be [d-Mbe(1 — pq) |? = d-* + 0, 
which was to be shown. 
P.7 (iii). Since e’ = —(0T/0B')—m(0T/0B) by definition, one 
merely has to introduce the derivatives with respect to £,7,¢,7, 
which is easily done with the help of (15.11) and (72.4-5). 
P.7(iv). By (80.3), writing 7 = @ 64+ G59 +...+G5C', 
Gg = (s—m) A*t*O 

= (s = m)? (1 6 Occ _ 82x mng at Cie) (%e Sloe F Gs Ent a Gu 7°) 

= 8(s —m)? (895-293 + 3911). 
The coefficients govern ninth-order quintic skew coma. 


Chapter 8 


P.8(i). As in P.6(iii), one should use invariants defined in terms of 
y, & rather than m,y, m,z. Taking d’ = 1, 


m 
V = const. — Mab Make _(, +£,—2m,,+m?¢, 


+ £9 — 2m Ig +m ay +v, (P.8(i), 1) 


all non-linear terms of g(¢,, ¢,) having been absorbed in v. A sharp 
paraxial image is supposed to exist. B’, with which LE’ will be taken 
to be coincident, is chosen to be so situated that O,C = C’B’ in the 
usual notation. Then V must be invariant under the simultaneous 
interchange of &, with ¢, and &, with ¢,. In the first place 

(1-—mif, = (i —m5) fe ee (P.8(i), 2) 


mM, Ms 
Considering the quadratic terms of (P.8(i), 1) one may omit those
varying as €7,0169,03, £3,182, 3, £101, 5262, 1a; the first three 
because they will give rise to relations involving irrelevant coeffi- 
cients (i.e. those which do not enter into the displacement), the 
next three because they become redundant when the first three are 
rejected, whilst the last three are in any case not affected when &,, 
&, are interchanged with ¢,, ¢, respectively. One is left with the quad- 
ratic terms 


My PybiIy + Me PsFy 72+ My Poi Sat Mes P7boMo+ M3 Py b1 So 
+ mi py SS My Ps 1 Sa + My M3 Pia Sa + PisMiMsS Ne 
+ Prs™2 Nobo t+ 3( — 418111 — 416 No — 4mm I S2— 4g Eo Mo 
+ 213 ba + 2mmi Cy bo — 4m, Sy — 4m, me, Co — 4mm, C, Np 
— 4m 260). (P.8(1); 3) 


The aberration function v® has been taken to be defined in terms of 
the more usual variables y,(= m,y) and 2,(= m2), and the p, 
are therefore those in (88.6). Since V is, however, contemplated at 
present as a function of y and z one has to supply the various powers 
of m, and m, which appear in (P.8 (i), 3). By inspection one now reads 
off the five third-order relations 


Pa-Mipyy = $(1 —mi)),. 
Ps—Mipis = 3(1 —m)), 

M3 Py—Mipry = 4(mi—m}), (P.8(i), 4) 
Pe—M3Pra = 2(1 —m)), 


Py ~— M3 Pig = 3(1 —m}). 


P.8(ii). The total number of coefficients of a homogeneous polynomial of degree
$n + 1$ in four variables is $\tfrac{1}{6}(n + 2)(n + 3)(n + 4)$. A typical term is of
the form $c_{\lambda\mu\nu}\,y'^{\,n-\lambda+1}y^{\lambda-\mu}z'^{\,\mu-\nu}z^{\nu}$, which takes a factor $(-1)^{\mu}$ when the
signs of $z'$ and $z$ are simultaneously reversed. Thus $c_{\lambda\mu\nu} = 0$ when
$\mu$ is odd. When $n$ is even the number of coefficients which so vanish
is $\tfrac{1}{12}(n + 2)(n + 3)(n + 4)$, and there are still $\tfrac{1}{2}(n + 2)$ coefficients
multiplying powers of $y$ and $z$ alone. One is left with the number of
coefficients given by (88.14). The case of odd $n$ may be dealt with in
the same way.
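The three counts quoted in P.8(ii) for even $n$ can be reproduced by brute-force enumeration of monomials. This is only an illustrative sketch: the exponent tuple is taken in the order $(y', y, z', z)$ and the function name is hypothetical.

```python
from itertools import product

def count_terms(n):
    """Count monomials of total degree n + 1 in (y', y, z', z).

    Returns (total, vanishing, object_only):
      total       - all monomials of degree n + 1,
      vanishing   - those odd in (z', z) jointly, whose coefficients vanish,
      object_only - surviving monomials containing y and z alone.
    """
    degree = n + 1
    total = vanishing = object_only = 0
    for a, b, c, d in product(range(degree + 1), repeat=4):
        if a + b + c + d != degree:
            continue
        total += 1
        if (c + d) % 2 == 1:
            # changes sign when z' and z are reversed together
            vanishing += 1
        elif a == 0 and c == 0:
            # no primed coordinate present
            object_only += 1
    return total, vanishing, object_only
```

For $n = 2$ this gives 20 monomials in all, 10 of which vanish by symmetry, with 2 of the remainder depending on $y$ and $z$ alone.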




P.8 (iii). One possibility (which is particularly well adapted to 
straight line images) is to take the posterior base-surface to be a 
circular cylinder W, through E’. The axis of the cylinder is normal 
to the meridional plane and passes through the point (0,+y,, 0). 
The equation of W, is evidently 


wx’ = 1—-(1—£4+27)2. (P.8 (iii), 1) 
Then Ve = 9(6) -(1 +6 +78 +0, (P.8 (iii), 2) 
in view of (86.1), and this defines v,. Referring to Section 36, one 


has OW, oV, 


Peo ed re t en oes 
B= sO Xs = 5 PBGii) 3) 


where X’ is defined by (37.5). In (P.8(ii1), 2) only v, depends upon 
y’. In the usual way one then finds 


Ov, 


gE, 


d= Xb (ea) - X04 S474], 


P.8 (iv). We now have generically 
VY’ =y,A(G, 6), 2’ = 2 BO, 62). (P.8 (iv), 1) 


There can be no term of the form 2, C(G,, ¢,) in Y’, since when the 
y- and ¥’-axes are inverted Y’ and y, reverse sign independently 
of z,; and so on. By the usual argument 


V = g(b, S)—[1+ (mA -y' P+ (% B27}. (P.8(iv) 2) 

Write for the moment 
A=14+4,64+420+0(4), B= 1+b,0+6,6,+ O(4). 
Then the quadratic terms of (P.8(iv),2) other than those which 
remain when a, = a, = b, = b, = o are 
V) = A, 6, + 4991 Cot 5161 92+ deeb, (P.8 (iv), 3) 
terms depending upon ¢, and ¢, alone having been rejected. Com- 
parison with (88.6) then gives 
A=14+Pi36tPiubet O(4), B= 1+Pisbi + Piebs + O(4). 


(P. 8(iv), 4) 
All other third-order coefficients vanish, as they must. 
P.8(v). Not in general. Write (88.8) as


3 =p+acos20+bsin20, 2 =q+ccos20+dsin20. (P.8(v),1) 


Then, for example, when ad — bc = 0 a zonal curve degenerates into 
a segment of a straight line. No matter what the values of the co- 
efficients may be, provided only that 


(3P1—Pe) Pe 
(Ps — 322) Ps aa a 


there will exist a value of y such that this degeneration occurs. The 
zonal curves are circles if and only if 


Ps—3P7 = +2p, and 3p,—p,e=+2p,. (P.8(v), 3) 


Chapter 9


P.9(i). The number in question is the number of coefficients
governing the terms of degree m in y,7 in the function G of equation 
(99.4), and this is +1. 


P.9(ii). When $\beta = \gamma = 0$
“Y= 2a tS, - 7) Ble’, —2' = (J, -q')y'/o’. 
(P.9 (ii), 1) 


In the absence of spherical aberration y’ and z’ must vanish. Thus 
J has to satisfy the two conditions 


J,=0, J,=q when f=y=0. (P.9 (ii), 2) 
Reference to (95.4) at once shows these to imply that 
k, = 0, kh, =; pi=~Ps=ps=09, (P.9 (ii), 3) 
consistently with (95.6—7). 


P.9(iii). A circular patch obtains if and only if the factors multiplying
$\beta'$ and $\gamma'$ respectively in (P.9(ii), 1) are equal. Therefore


one must have y sg: her: pay=o. |) (P.9 (iii), 1) 
Chapter 10
P.10(i). Let m tend to zero in (106.7). Then 
Ye = (K+Sc)6+(1-K+SCq)B, Yi =U. 

Also e’ = s(c,o+¢p). (P.10(i), 1) 
From these equations one gets the desired result 

€, = 9(K+5¢,)“1 {e, yo + [Keg + (K-31) c] yi}. (P.10(1), 2) 
This will be found to be in agreement with (106.11). 
P.10(ii). Taking $q' = q = r = 1$ and $N = 1$ in (4.5) one has

$T = [(N' - 1)^2 + 2N'\chi]^{1/2}$, (P.10(ii), 1)

where $\chi$ is given by (65.10). Setting $N' = N'_0(1 + \nu_1\omega + \dots)$ and
expanding $T$ in ascending powers of $\omega$, one gets the stated result
without difficulty.


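P.10(iii). Write (101.4) in the slightly more general form

$\omega = \delta\lambda/(1 + \alpha\,\delta\lambda)$, (P.10(iii), 1)

whence $\delta\lambda = \omega/(1 - \alpha\omega)$. (P.10(iii), 2)

Now, if $\delta\lambda^* = \delta\lambda - \rho$ then

$\omega^* = \delta\lambda^*/(1 + \alpha^*\delta\lambda^*) = (\delta\lambda - \rho)/[1 + \alpha^*(\delta\lambda - \rho)]$, (P.10(iii), 3)

where $\alpha^*$ is a constant. Inserting (P.10(iii), 1) into (P.10(iii), 3) one
gets an expression for $\omega^*$ which is linear in $\omega$ provided one chooses

$\alpha^* = \alpha/(1 + \alpha\rho)$. (P.10(iii), 4)

Then $\omega^* = (1 + \alpha\rho)[(1 + \alpha\rho)\omega - \rho]$. (P.10(iii), 5)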
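The linear relation (P.10(iii), 5) can be verified exactly with rational arithmetic. In this sketch `dlam` stands for the increment in (P.10(iii), 1), `alpha` for its constant, `rho` for the shift and `alpha_star` for the choice (P.10(iii), 4). These variable names and the sample values are assumptions made for the example.

```python
from fractions import Fraction

def omega(dlam, alpha):
    """The variable of (P.10(iii), 1): dlam / (1 + alpha*dlam)."""
    return dlam / (1 + alpha * dlam)

def omega_star_direct(dlam, alpha, rho):
    """omega* from (P.10(iii), 3), with alpha* = alpha / (1 + alpha*rho)."""
    alpha_star = alpha / (1 + alpha * rho)
    dlam_star = dlam - rho
    return dlam_star / (1 + alpha_star * dlam_star)

def omega_star_linear(w, alpha, rho):
    """omega* from the linear form (P.10(iii), 5)."""
    return (1 + alpha * rho) * ((1 + alpha * rho) * w - rho)

def check(alpha=Fraction(5, 2), rho=Fraction(1, 10)):
    """Compare both expressions exactly on a few rational sample points."""
    samples = [Fraction(k, 20) for k in range(-3, 4)]
    return all(
        omega_star_direct(d, alpha, rho) == omega_star_linear(omega(d, alpha), alpha, rho)
        for d in samples
    )
```

Both expressions reduce to $(1 + \alpha\rho)(\delta\lambda - \rho)/(1 + \alpha\,\delta\lambda)$, which is why the comparison holds exactly.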

P.10 (iv). In (107.3) K = 0, and therefore JR = 1, in view of (31.1). 

Hence K = 0 if : C = 2£JPS. a (P.10(iv), 1) 
1 1 001 


However, JP = (1 +6)“ by (31.4) and it suffices to set JP = 1. 
00 0 00 
Chapter 11


P.11(i). Consider $V = \int N\,ds$ as a function of the coordinates of
points on the ray along which the integral is extended, so that some 
initial point on the ray is kept fixed. We denote them by x, y, z 
although primed symbols would be more consistent. Then 


$dV/ds = N$, (P.11(i), 1)


bearing in mind that d/ds denotes differentiation along the ray. 
Take the gradient of both members of this equation, so that 


d(grad V)/ds = grad N. (P.11 (i), 2) 


Equations (112.4) were derived directly from Fermat’s Principle. 
From the first three we have 


$\operatorname{grad} V = N\mathbf{e}$, (P.11(i), 3)
and with this (P.11(i), 2) becomes just the required equation.
P.11(ii). Since p = (1, 0, 0) the three equations (111.7) read 

(b'x'/N’) —(ba/N) =o, a'f'/N’ =aB/N, a'y'/N' = ay/N. 


(P.11 (ii), 1) 
Squaring and adding the last two,

$(a'/N')^2\sin^2\phi' = (a/N)^2\sin^2\phi$. (P.11(ii), 2)

On the other hand

$N'^2 = a' + (b' - a')\cos^2\phi'$, $N^2 = a + (b - a)\cos^2\phi$. (P.11(ii), 3)

Then (P.11(ii), 2) becomes $Q' = Q$, where

$Q = \dfrac{a^2\sin^2\phi}{a + (b - a)\cos^2\phi}$.

Solving for $\sin^2\phi'$, we get

$\sin^2\phi' = \dfrac{b'Q}{a'^2 + (b' - a')Q}$, (P.11(ii), 4)

which is, in effect, just the required equation.


Chapter 12 
P.12(i). The surface $\mathcal{S}$ in question must obviously be a surface
of revolution. Let $A(0, 0, 0)$ be its pole, $B(-q, 0, 0)$ and $B'(q', 0, 0)$
the required points ($q, q' > 0$), and $P(x, y, z)$ the point of incidence
of some ray through $B$ and $B'$. Then $V(BPB')$ must have the same
value for all such points, i.e.

$N'[(q' - x)^2 + (y^2 + z^2)]^{1/2} + N[(q + x)^2 + (y^2 + z^2)]^{1/2} = N'q' + Nq$. (P.12(i), 1)

We had taken $q$ and $q'$ to be positive. If the object point is virtual,
let the coordinates of $B$ be $(q, 0, 0)$ with $q > 0$. Then (P.12(i), 1) has
to be replaced by

$N'[(q' - x)^2 + (y^2 + z^2)]^{1/2} - N[(q - x)^2 + (y^2 + z^2)]^{1/2} = N'q' - Nq$. (P.12(i), 2)

In general $\mathcal{S}$ is therefore of the fourth degree.
P.12(ii). Let $\mathcal{S}$ be a sphere of radius $r$, so that

$y^2 + z^2 = 2rx - x^2$. (P.12(ii), 1)

Using this in (P.12(i), 2), there comes

$N'[q'^2 + 2(r - q')x]^{1/2} - N[q^2 + 2(r - q)x]^{1/2} = N'q' - Nq$.

This is possible only if

$N'q' = Nq$ and $N'^2(q' - r) = N^2(q - r)$. (P.12(ii), 2)

(Were (P.12(ii), 1) used in (P.12(i), 1) one would not be able to
satisfy the resulting equation.) Equations (P.12(ii), 2) yield

$N'q' = Nq = (N' + N)r$. (P.12(ii), 3)

The locations of $B$ and $B'$ are therefore determined, and both points
lie on the same side of $\mathcal{S}$. Note that

$q'/q = m = N/N'$, (P.12(ii), 4)

so that the actual magnification associated with $B$ and $B'$ is $(N/N')^2$.
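A numerical sketch can confirm that the points fixed by (P.12(ii), 3) satisfy (P.12(i), 2) at every point of the sphere. The indices, the radius and the function names below are illustrative assumptions only.

```python
import math

def aplanatic_points(N, N_prime, r):
    """Distances q and q' from (P.12(ii), 3): N'q' = Nq = (N' + N) r."""
    q = (N_prime + N) * r / N
    q_prime = (N_prime + N) * r / N_prime
    return q, q_prime

def path_residual(x, N, N_prime, r):
    """Left minus right side of (P.12(i), 2) at the point of the sphere with abscissa x."""
    q, q_prime = aplanatic_points(N, N_prime, r)
    h2 = 2.0 * r * x - x * x  # y^2 + z^2 on the sphere, (P.12(ii), 1)
    lhs = (N_prime * math.sqrt((q_prime - x) ** 2 + h2)
           - N * math.sqrt((q - x) ** 2 + h2))
    rhs = N_prime * q_prime - N * q
    return lhs - rhs
```

With `N = 1.0`, `N_prime = 1.5` and `r = 1.0` one gets `q = 2.5` and `q_prime` equal to 5/3, and the residual vanishes to rounding error for every `x` between 0 and `2 * r`.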
P.12(iii). By differentiation of (117.17) we have

$N\alpha y = f(1 + 2\kappa\chi)^{-1/2}(\alpha'\beta - \alpha\beta') + N(q - r)\beta$,
$N'\alpha'y' = f(1 + 2\kappa\chi)^{-1/2}(\alpha'\beta - \alpha\beta') + N'(q' - r)\beta'$. (P.12(iii), 1)

If spherical aberration of all orders is absent, $y$ and $y'$ vanish
together, and so

$\Delta[N(q - r)\beta] = 0$. (P.12(iii), 2)

Using (P.12(ii), 2) and (P.12(ii), 4), this reduces to

$\beta - m\beta' = 0$, (P.12(iii), 3)

i.e. the sine-condition is satisfied.


P.12(iv). According to (120.26) the vanishing of third-order 
spherical aberration and coma would require that the two equations 


pte=0, Op+dc=0 (P.12(iv), 1) 


be simultaneously satisfied. Since ¢ +0 by hypothesis (y + 0), 
one would have to have 0— 6 = o. However, 


6-6 =rdly +0. (P.12(iv), 2) 


P.12(v). Write 

$a = V^2 + W^2$, $b = VV' + WW'$, $c = V'^2 + W'^2$. (P.12(v), 1)
Then, from (119.9) 
S = (flr —KX+ $e") [v'(1 — ge + $c?) — (1 — a+ §a*)] 


+ Nrt}AV+O(6).  (P.12 (v), 2) 
Here

$\chi = 1 - (\alpha\alpha' + \beta\beta' + \gamma\gamma') = 1 - (1 + b)(1 + a)^{-1/2}(1 + c)^{-1/2}$
$= \tfrac{1}{2}(a - 2b + c) - \tfrac{1}{8}(3a^2 - 4ab + 2ac - 4bc + 3c^2) + O(6)$. (P.12(v), 3)
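The series in (P.12(v), 3) can be compared numerically with the exact expression. The sketch below assumes $V = \beta/\alpha$, $W = \gamma/\alpha$ and likewise for the primed ray, so that $\alpha = (1 + V^2 + W^2)^{-1/2}$. The function names and the sample direction ratios are hypothetical. Since the series is symmetric under interchange of $a$ and $c$, it does not matter which of them is assigned to the primed ray.

```python
import math

def chi_exact(V, W, Vp, Wp):
    """1 - (alpha*alpha' + beta*beta' + gamma*gamma') from the ratios V, W, V', W'."""
    alpha = 1.0 / math.sqrt(1.0 + V * V + W * W)
    alpha_p = 1.0 / math.sqrt(1.0 + Vp * Vp + Wp * Wp)
    # beta = alpha*V and gamma = alpha*W, and similarly for the primed ray
    return 1.0 - alpha * alpha_p * (1.0 + V * Vp + W * Wp)

def chi_series(V, W, Vp, Wp):
    """Second- and fourth-degree terms of the expansion quoted in (P.12(v), 3)."""
    a = V * V + W * W
    b = V * Vp + W * Wp
    c = Vp * Vp + Wp * Wp
    return (0.5 * (a - 2.0 * b + c)
            - 0.125 * (3.0 * a * a - 4.0 * a * b + 2.0 * a * c - 4.0 * b * c + 3.0 * c * c))

def series_error(eps):
    """Truncation error when all four ratios scale like eps."""
    args = (1.0 * eps, 0.5 * eps, -0.7 * eps, 0.3 * eps)
    return abs(chi_exact(*args) - chi_series(*args))
```

The neglected terms are of degree six, so halving `eps` should reduce the error by a factor of about sixty-four.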


Inserting this in (P.12(v), 2) and selecting the terms of degree 5 in 
V’, V, there comes 


g® = f{— 3(va? —v'c?) — }x(va—v'c) (a—2b +c) 
+4(v' — 2) [k(3a? — 4ab + 2ac — 4be + 3c”) + 3x2(a—2b+c)*]} AV. 


(P.12(v), 4) 
The factor multiplying S(.S? + S?) in (g® is therefore 


af(R— 1) i[ — 3004 + 30'v4 — 2K(v' — v)? (vo? — v'v?) 


+ K(v' — 2) (30'4 — 40'8v + 20"*v" — 4v'v3 + 304) + 3K°(0' — v)]. 
Taking out a factor v' − v from the expression in the square brackets
and noting that «(v’ —v)? = 7’, this becomes 


3f(k —1)?2?[ —vo'(v'"? + vo’ +: 0®) +: (v2 + 07) +070]. 
The expression in the square brackets clearly has a factor 
it’ —vv' = cy(t' +2), 


the other factor being (k?—k+1)i?+3(k—1)v+3v*. Recalling 
(120.14) and (119.8) the required result now follows at once. 


P.12(vi). The first relation follows immediately from (120.12), 
(120.14) and the second member of (128.12). Then 


1635+ 43, = O(a’ + v"?—200' +02) = @(k2—Rk+1)i%. (P.12(vi), 1) 


P.12(vii). The invariant Na-\( Y*y —Z*) has the same value at 
the jth as at the first surface. As after equation (128.19), one has 


(s—m) fx* Na" Y*y — Z2*f) 


-_ ao, bz Oo; /ly) + (Fy Cuz — oO, ony) i (Hy Oo — fz Osy)} a O(6) 
= aor, fy — Fe fy) [1+ (8, +8,)] + O(6). (P.12 (vii), 1) 


Hence a'1+6,4+6,) = a (1 t+0,1+ 6,1) + O(4), 
or by, +85 = (4/%—-1)4+9,44+9,1+ O(4), (P.12 (vil), 2) 


which is, in effect, the required relation. 
P.12(viii). We have the given equations 
j 
Xs = A;e+ 2X (ax + A34,X8 + Ag, x2 +...). (P.12 (viii), 1) 
t= 


Let Xz = A;X+Ag;x8 + Ag x + Agjx?+.... (P.12 (vill), 2) 
Inserting this into both members of (P.12 (viii), 1), there comes 


a;x+ Ag+ Ag+ Agix' +... 
j 
=a;x+ ¥ [a2(a,x+ Ayr? + Asx? +...) 
i=1 


+ ay(a,x+ Ag+ ...P+ay(ax+...)?'+...]. (P.12(viit), 3) 
Suppressing the index i on the right for the sake of clarity one reads
off the relations 


j 


Ay; = > aa,, 
i=1 
j 
Ag; = © (3074, A2+ a°as), (P.12 (viii), 4) 
i=1 


j 
Ay = > (3@°a,A3 + 34a,A3+ 5ata, A, +a7a,), 
i=1 


and so on. The first of these gives A,, in terms of given constants, 
after which the second gives A;,, and so on. 


LIST OF PRINCIPAL SYMBOLS†


Roman symbols 
measure of primary asphericity (= 47°s,—1), 280 
coefficient of linear transformation of direction cosines, 115 
axial point of surface of revolution, 14, 270 
A’ initial and final points of ray, 5 
the generator (s — m)? [4(0/0& 0¢) —(0?/0y?)], 127 
generator adjoint to A, 127 
3(s—m) A(v"), 299 
axis of symmetry of axially symmetric system, 22 


RR eye wR SD 


axis of system at least doubly plane-symmetric, 191 


b I—p (= (s—m)/(s—m)), 119 

b  (1—p)/(x—pq) (= (¢—m)/(8—m)), 122 

5 __ coefficient of linear transformation of direction cosines, 115 
B, B’ base-points, i.e. axial points of #, B’, 39 


B’ moved posterior base-point, 95 


B magnetic induction vector, 258 
B,B' anterior and posterior base-planes, 7 
Be curved posterior base-surface, 98 


c 1-—q (= (s—m)/($—m)), 97, 116 
(1 —9)/(x —pq) (= (s—@)/(8—m)), 122 


c axial curvature of Y, 272 


is) 


¢ coefficient of linear transformation of direction cosines, 115 
c,(m =1,2,...) coefficients of C(E) (= fl#*), 83, 121 
C,(% = 1,2) paraxial chromatic characteristic aberration coeffi- 


cients, 230, 236 


† The page numbers refer to the place where a symbol with the given meaning
first occurs.

C,(% = 1,2) chromatic mth-order coefficients of c,, 231
m 


C16. effective paraxial chromatic aberration coefficients, 238 
C axial point of central plane, 137, 156 

C(g) (2f/09)y-c-0(f = v, f, soos 83, I2I, 239 

CE) (00/8) cam 184 

C(g) one of a pair of functions describing .4*’, 151 

6 a curve, 4 

C central plane of reversible or concentric system, 136, 156 


C axis of symmetry of toroidally symmetric system, 212 


d _length of straight segment of ray through E’, 244 

d ___ distance between O, and E, 43 

d' distance between B’ and Oj, or E’ and O4, 19, 42 

d,d' values of d, d’ constant upon shifts of O, or E’, 97 

d; _ distance between A,,, and A,, 272 

d coefficient of linear transformation of direction cosines, 
115 

d* —m(1~—m?)f, 138 

d™ — (an—1)th-order part of D= T—T, 126 

Be oj, ee og (2n — 1)th-order coefficients of d,,d,, 302 

D deformation of the wavefront, 105 

D T-T, 126 

D length of straight segment of ray, 244 

D’ point of intersection of # with posterior base-surface, 42, 

98 

Dg) one of a pair of functions describing %*', 151 

D,,(k = 1, 2,3) components of D, 258 

d,,d, | modified increments, 295 

D  A(B/v,)—pA@, 288 

D electric displacement vector, 258 

Dy) amplitude of D, 258 
e Nn, 277

e;(t = 1, 2,3) components ofe (= a, f,y), 251 

E, E’ points of &, &’, 42 

E,, E; points conjugate to E before and after. %,, 269 
E,,(k = I, 2, 3) components of E, 258 

e unit tangent vector to ray, 3 

E ___ electric field vector, 258 

E, amplitude of E, 258 

6,6" paraxial pupil planes, 42 


J mean focal length of K, 40 

f mean focal length of spherical surface, 276 

f aberration function associated with F, 19 

tr (mean) focal length of K as a whole, 274 

fn mth-order part of the aberration function f, 31, 224 
f™ > (an—1)th-order aberration function, 50, 226 

Stuf focal lengths of anamorphotic system, 193 

Ri anterior and posterior focal lengths, 40 


j ” characteristic aberration function of coordinate order m and 
chromatic order m, 224 
(n) me he 2 : : 
yy characteristic aberration coefficients of coordinate order y 


and chromatic order m, 226 


Ke characteristic aberration coefficients of order 27 —1 
(coefficients of f™), 51 


Sais general nth-order characteristic aberration coefficient, 31 
F general characteristic function, 17 
Fy general ideal characteristic function, 19, 224 


FF’ principal focal points, 40 


2 arbitrary additive function in F, 20, 47, 52 


G(x) angle characteristic of concentric system with central 
base-points, 158 
G,  (2n−1)th-order coefficient of G(x), 158


4 AA (Sas & = AA, AA,), 275 
g* AA*, 277 
G G,+G,, 285 
j-1 
G, 39 > Baw 284 


k 
G,. j P23) San 284 
t=j 
G, G,, gt G, yp 284 
ge; fifth-order pseudo-coefficients of g or g*, 2098 


h object height, 196 
h’ ideal image height, 55, 196 
H. 


j coordinates of intersection point of # with normal plane 


through point before Y conjugate to B, 275 


H; (reduced) coordinates of intersection point of # with 
normal plane through O,,, 278 


H,* a;,H, 277 
Hy; (reduced) coordinates of intersection point of Z with 
normal plane through E,, 283 


H magnetic field vector, 258 
Hy amplitude of H, 258 


Las tp paraxial constants relating to i, 273 
L,I angles of incidence and refraction, 3 
I’ ideal image point, 46 

i cy+B, 273 


ee conjugate object and image planes, 19 

J J" shifted conjugate object and image planes, 1 14 
fe curved object surface, 153 

Bd curved image surface, 150 


ji -(m—m)(s—m)+, 120 
qis(h−m)(s−m)-*, 123

(co*—£P%)4, 83 

the generator (s—m)?(@/@n), 129 


wy Aw WS 


generator adjoint to J, 130 


N/N', 278 
(cc = $y(wt+31)tv+r), 117 
paraxial characteristic coefficients, 39, 172, 198 
generic symbol for optical system, 8 
offence against the sine-condition (= «,/h’), 88 


K/h’, 90 


AARNE Te 


Ky, Ky, tangential, sagittal, mean comatic asymmetry, 
61 


l, location of Oy, 273 
Lo —YqjlVq; (i.e. location of Og;), 285 
Lo (1—m?)-* (mE —2my +6), 139 


m reduced magnification associated with ¥,.¥’, 19, 41 
m value of m after % has been moved, 114 

M = (1—m*)-* [mE —(1 +m?) {+e}, 139 

M the generator (s—m)?(0/0¢), 127 

M _ generator adjoint to M, 127 

M V,-—mV;, 274 


n  wave-index (magnitude of n), 255 

N,N’ _ refractive index in object and image space, 3 

N(x, y, 2) refractive index function, 5 

N ratio of energy density to magnitude of energy flux, 260 
N  (1—m*)-*(E—2my+m£), 139 

n the vector (N,, Ng, N,), 250 

n grad V, 259 
O, O' points of %,.%’, 19
Og, O5 axial points of ¥,.%’, 42 . 
O> O% and Oj are equidistant from central plane, 137 
Op; O6; points conjugate to O, before and after Fj, 269 
O(n) order symbol (terms of order exceeding nm), 26 


O(n,) terms of coordinate order exceeding m whose chromatic 
order exceeds 7, 227 


p location of E, 273 
Po —(h—m)(s—m), 119 
Pi (@ = 1,..., 4) variable conjugate to ray-coordinate g;, 18 
Pa (%=1,...,6) third-order characteristic aberration coeffi- 
cients of symmetric system, 55, 61, 281 
Pa(% = 1,..., 6) third-order characteristic normal aberration 
coefficients of semi-symmetric system, 175 
Pu(% =1,...,8) third-order characteristic aberration coeffi- 
cients of c-symmetric or t-symmetric system, 
196, 215 
Pa(% =1,...,16) third-order characteristic aberration coeffi- 
cients of doubly plane-symmetric system, 201 
b.(® = 1,2,3) third-order characteristic skew aberration co- 
efficients of semi-symmetric system, 175 
p.(# =1,...,6) third-order characteristic aberration coeffi- 
cients associated with shifted pupil or object, 
97 
pa(% =1,...,6) third-order spherical point characteristic 
aberration coefficients, 101 
Pw Pa (& = 1, 2,3) third-order effective aberration coefficients, 
74 
Pas Pi(& = 1, 2, 3) third-order effective aberration coefficients 
associated with posterior coordinates in Wo 
| IoI 
Pag» Pag (% = 1, 2, 3) third-order coefficients of gj (org,), 302 


Pair Pay (% = 1,2, 3) third-order coefficients of W,;, 298 
Poaj» Poaj(% =1,2,3) third-order coefficients of d,,, 302


Pat» Pyaj(% = 1,2,3) third-order coefficients of d,;, 302 


9 
P point of incidence, 3 ; 
P= 1-—2w(dS/d£), 83 
P; &4+20,4+67, 298 
p  AN(E-A)i2 y(t, +02), 279 
p.j(% =1,...,.6) third-order pseudo-coefficients of t;, 307 
Daj Paj(% = 1,2,3) third-order pseudo-coefficients of g7 

(or g;), 298 
‘Pay Pog(& = 1,2,3) third-order pseudo-coefficients of w, 298


) unit normal to refracting surface, 3 


q — &'/(1 —’) = (§—s)/($—m), 96, 116 

gq location of B, B’ relative to centre of concentric system, 
157 

7 location of local base-points relative to A, 270 

g,(¢ = 1,...,4) ray-coordinates, 17 

Q 1—2w°(dC/d£), 83 

Q; 26,(7+4;6), 298 


ry _ radius of spherical surface, 14 

R 1+wC, 83 

R’ distance between E’ and I’, 100 
R; $76, 298 

Ram circle polynomials, 110 

# — genericray, 6 

Ry  base-ray, 26 

#%, _ base-ray at colour Ae 224 


s reduced. magnification associated with pupil planes, 42 
s 45(u) is the x-coordinate of points on refracting surface, 270 


Sn coefficients of power series for s(u), 272 
Sn coefficients of S(= ft), 83


§ value of s after shift of pupil planes, 116 


Sy(@=1,...,10) fifth-order characteristic aberration coeffi- 
cients of symmetric system, 62, 74, 
Ce (ca ane 10) fifth-order characteristic normal aberration 


coefficients of semi-symmetric system, 180 


5,(% = 1,...,6) fifth-order characteristic skew aberration co- 
efficients of semi-symmetric system, 181 


I, ..., 10) fifth-order characteristic aberration coefficients
associated with shifted pupils or object, 100


(a 


sy(@=1,...,10) fifth-order spherical point characteristic 
aberration coefficients, 101 
st 5% (@ =1,..., 6) fifth-order effective aberration coefficients, 
24 
si,fi(a=1,..., 6) fifth-order effective aberration coefficients 


associated with posterior coordinates in Ws 


IOI 
st(a= T2655) 


= 3fe ne 6 
5% (@ = 1,...,6) ; 

wk 6 fifth-order effective aberration coefficients 
S3(@= 1.4 ) 


of semi-symmetric system, 181 
* ? 
#sh (a = 1, is 3) 


*5% (% =1,...,3) 

Sag(@#=1,...,10) fifth-order coefficients of t;, 309 

Saj Sag (%@ = 1,...,6) _ fifth-order coefficients of 27 Org,, 302 
“Sag Sug (% = 1, ..., 6) fifth-order coefficients of Wy, 302 

S the generator (s— m)*(6/0£), 127 


S generator adjoint to S, 127 | 
S(t) flE,0,0)(f =v, .--), 83, 121, 239 

S Vi-sVj, 274 

s vector from initial point to point of incidence, 4 

S energy flux, 259 

$3 (@ = 1,..., 10) fifth-order pseudo-coefficients of tj, 308 
3,4 3aj(% =1,...,6) fifth-order pseudo-coefficients of g} or
£;, 298 

‘8aj) Baj(% = 1,...,6) fifth-order pseudo-coefficients of w,, 298 


S surface of revolution, usually spherical, 14, 269 


t angle characteristic aberration function, 52, 236 


t, tof jth surface, local base-points, 282 
t= 11515) seventh-order characteristic aberration co- 
efficients, 74 
tl(a =1,..., 15) seventh-order point characteristic aberration 
coefficients, 101 
0", t% (a =1,..., 10) seventh-order effective aberration coeffi- 
cients, 74 
ti, fi (% = 1,..., 10) seventh-order effective aberration coefficients
associated with posterior coordinates in Wo, 101


i") (2n—1)th-order angle characteristic aberration coefficients 
of symmetric system, 73 


tr _ trace (of tensor): tr €yy = €,; + €g2 + €gg, 261 

T angle characteristic, 11 

T, ideal angle characteristic, 46 

foe angle characteristic, base-points Oy,O05, 283 
T, angle characteristic, base-points E, E’, 283 


Taj To; T,. Ty for jth refracting surface, local base-points, 
283 


T angle characteristic referred to shifted base-points, 119 
T focal angle characteristic, 125 
fd T of concentric system, base-points O$, Oo, 137 


T® — (a2n—1)th-order part of T;(= ¢f, > 1), 282 


E—ayn+, 47 
I—sm, 137 


uo yptZp, 270 
ut), um) (2m — 1)th-order effective aberration coefficients of
symmetric system, 73 

uy), a 

PAO) (2n — 1)th-order effective aberration coefficients 


ay : 
#,(n) #=(n){ Of Semi-symmetric system, 180 
u“ bp? u jv 


v point characteristic aberration function, 52 
v- S—m, 139 
v' spherical point characteristic aberration function, 100 


v* —_ normal point characteristic aberration function of semi- 
symmetric system, 174 


o* skew point characteristic aberration function of semi- 
symmetric system, 174 


%, ~—_—~point characteristic aberration function for arbitrary 
posterior base-surface, 99 


oy modified point characteristic aberration function, 99 


ot modified spherical point characteristic aberration function, 
102 


® point characteristic aberration function associated with 
shifted posterior base-plane, 95 


B u(0,y’,2’,y,8), 98 


VM, Yin) (2m — 1)th-order part of v, vi..., 10 
vy” _ part of &” varying with p, h' as pf, 67 
vy, OG ~~ (2n—1)th-order (point) characteristic aberration coefficients:
coefficients of Um, Hr), 67


vf}, o@ (2n — 1)th-order characteristic aberration coefficients: 
coefficients of v*), g#@), 175 


Vaj»Upj +=“: paraxial constants relation to B;orV;, 273, 274 

V point characteristic, 8 

V effectively the phase of electromagnetic field vectors, 258 
Vo Bla, 274 

V* — optical length, 4 

V optical distance, 5 


Vt spherical point characteristic, 100


V;, point characteristic associated with arbitrary posterior 
base-surface, 98 


V_ point characteristic associated with shifted base-surfaces, 95
Vo ideal point characteristic, 20, 46, 49 


w  (1+£)8, 83 

ry I—m*, 139 

w%},m%)  (2n—1)th-order coefficients of w, 301 

W _ energy density, 259 

Wyle, 274 

W,, W, mixed characteristic functions, 15 

Wr Woo ideal mixed characteristic functions, 15 

w ogtle, 277 

107}, 10%} (2n—1)th-order pseudo-coefficients of w, 297 
W wavefront, 34 


W> posterior base-surface for V': ideal spherical wavefront, 
100, 103 
x x-coordinate of initial point, 5, 10
x’ x’-coordinate of final point, 5, 10
current Cartesian coordinate, 5 
Xp x-coordinate of point of incidence, 270 


x amount of shift of object surface, 114 


Rx’ amount of shift of posterior base-plane, 95 
XX’ = (1—x")/a’, 100 
Xx’ x’-coordinate of points of %*’, 150 


y  y-coordinate of initial point, 5, 10, 16
y’ y’-coordinate of final point, or of B’, 5, 10, 16
y, y’ current Cartesian coordinates in object and image space, 5
Yi MY, 47
9 y’-component of partial displacement, 56
YeVe  y-, y’-coordinates of points of @, &’, 61, 164
3’ y’-coordinate of points of shifted exit pupil, 97
y" y’-coordinate of points in W, 103
y* —8, 172 
Yp —_ y-coordinate of point of incidence, 270 
Yai Yoj paraxial constants relating to Y;, 273 
Y __ y-coordinate of points of polar tangent plane of S, 285 
Y’ y’-coordinate of points of .%’ in the context of V, 19, 29, 40
Y’ y’-coordinate of points .%*’, 150
Y* ay, 285 
z z-coordinate of initial point, 5, 10, 16

z’ z’-coordinate of final point, or of B’, 5, 10, 16

z, z’ current Cartesian coordinates in object and image space, 5

2 mz, 47 

%  (N’/N)z, 190 

2 z’-component of partial displacement, 56

Ce Ds z-, z’-coordinates of points in 6,8", 61, 164

& z’-coordinate of points of shifted exit pupil, 97

a z’-coordinate of points of Wy, 103

sty, 172

zp z-coordinate of point of incidence, 270

Z z-coordinate of points of polar tangent plane of S, 285

Z’ z’-coordinate of points of .f’ in context of V, 19, 29, 40

Z’ z’-coordinate of points of %*’, 150

Z* aZ, 285 

Bar ba (% = I,...,5) coefficients ancillary to 3,, 3, (% = 1,...,6), 
300 

da(% = 1,...,8) coefficients ancillary to8,(« = 1,..., 10), 308 


Greek symbols
α x-direction cosine, 10


a absolute invariant of order 7, 128 


r 


β y-direction cosine, 10


γ z-direction cosine, 10


δ variational symbol, 4

δ longitudinal spherical aberration, 88

Δ retardation of the wavefront, 106

Δ defined by ΔX = X’ − X for any X, 270


) (04, Up — Vp Va)? (HA, + 04 AB,), 295 
5,5, increments relating to 6,p, 2386 
5,,5,, increments relating to S, M, 284 


€y  dielectric tensor, 258

€,(m = 1,2,3) principal values of €4, 263 
é€ aberration of ray (displacement), 19 

€, _ nth-order displacement, 32 


e, _ mth-order displacement of chromatic order m, 225 
m 

¥*e! pseudo-displacement, 85 

é’ _— displacement after shift of O, or E’, 97 

€,(Wn) nth-order displacement induced by ¥,,, 33 


rotational invariant (appropriate to context), 36, 38, 48 


¢ 

C reflection invariant, 191, 200, 214

C $\beta'^2 + \gamma'^2$, 48

€ corresponds to ζ for non-standard reference planes, 95, 117


rotational invariant (appropriate to context), 36, 38, 48 


$\beta\beta' + \gamma\gamma'$, 48


3] 38 
reflection invariant, 191, 200, 214


corresponds to η for non-standard reference planes, 95, 117
tensor inverse to €,, 261 


angular polar coordinate in &, 55 


Up tay 274 
YolVay 281 
U/Vq, 288 


NN'/(N'—N)*, 14 
viv’, 235 
tangential and sagittal circular coma, 59, 84 


(—1)" (?) d*, 138 


wavelength, 1 
base-colour, 223 
Lagrange invariants, 29 
—NvH, 275 

ah, 275 


Ment, ‘m-invariants of order 2 —1, 129 


p 
Bj 


B —m8’, 76, 273 
B+d;, 296 


N/N, 230 
0 
N’/N’, 230 


dispersion coefficient of chromatic order m, 227 


rotational invariant (appropriate to context), 36, 38, 48 


reflection invariant, 191, 200, 214 
E $\beta^2 + \gamma^2$, 48
£ corresponds to ξ for non-standard reference planes, 95, 117


T,(α = 1,...,8) second-order characteristic aberration coefficients
of singly plane-symmetric system, 204


p radial polar coordinate in &’, 55 


Po _ largest value of p, 56 


ao A(Neosl), 5 

o transverse spherical aberration: ¢;(h’ =0,@9=0), 84 
σα (α = 1,..., 5) Seidel coefficients, 55
&,(α = 2,3,5) skew Seidel coefficients, 176

Oon—1,0 s-invariants of order 2n—1, 118, 129 

os B-sP', 76, 273 

o+d,;, 296 


Mo 


summation sign, 18 


T skew rotational invariant, 37, 171 
T the generator (s—m)(@/@r), 183 


generator adjoint to T, 183 


T 
¢,¢' angles between . and initial and final axial rays, 84 
d 


Fells = m) Nri,|; 275 
@™ — factor multiplying p?"—*h’4 in o&™, 67 


x out-of-focus parameter, 54 
X I-(eol + BB + yy’), 158 


wv angular polar coordinate in-¥, 197 
yy u(ds/du)*, 271 


Vn any one nth-order term of aberration function, 33 


Ww chromatic coordinate, 224, 228 
Q primary chromatic part of (D—d)-sum, 244 
Special symbols
V defined by VX = X’ +X for any X, 270 
@  cA(1/N), 279 


G (s—m)-*6, 299 
Grad the vector operator (0/dx, a/8f, Ofey), 261 


Affixes†
X" relates to the image space (of K or a particular S), if X 
relates to its object space, 3 


xX, if X; relates to Fj, the value k of j relates to the last 
surface of K, i.e. the surface adjacent to the image space, 
269 

Xx, nth-order part of X, or a coefficient of this, 31 


X@ =~ (an—1)th-order part of X, or a coefficient of this, where 
K is at least doubly plane-symmetric, 50 


part of X of chromatic order m, or a coefficient of this, 224 


xX 

x value of X associated with the base-colour A, 223 

x only when X = ξ, η, ζ: the particular rotational invariants
$\xi = \beta^2 + \gamma^2$, $\eta = \beta\beta' + \gamma\gamma'$, $\zeta = \beta'^2 + \gamma'^2$, 48

X" relates to the spherical point characteristic, 100 

X refers to non-standard positions of reference planes, 95


X*, X* jointly distinguish effective from characteristic
aberration coefficients, 74


Xt, Xt analogous to X*, X* when considering a curved posterior
base-surface, 101
xX*, Xt dual of X,, X,( = —X,, X,), 172 
X distinguishes coefficients relating to skew aberration func- 
tion, 175 


xX*, X# jointly distinguish quantities associated with parts of
the aberration function of semi-symmetric or c-symmetric
systems, 174, 194


† X generally denotes any quantity appropriate to the context, and it functions
merely as a carrier of the various affixes. 
Miscellaneous conventions


<X;> denotes the pseudo-expansion of X;, 297 


xX denotes any pair of quantities X,,, X, which transform as 
the components of a two-vector under rotations about 7, 
39 

Xx denotes ordinary three-vectors, 3 


A.B A,B,+A,B, 39 
INDEX


aberrations (see also aberration coefficients,
aberration functions, displacement), 19, 32, 225
absolutely invariant, 127, 130, 167, 
183, 282 
astigmatic, 72 
asymmetric, 71 
chromatic, 223 
classification of (see types of) 
comatic, 72 
computation of, 265 
interaction between, 202, 216 
invariant, 118, 127, 130, 167, 183, 282 
m-invariant, 127, 131, 183, 238
normal, 178 
orders of (see order) 
out-of-focus, 84 
semi-invariant, 127, 131, 133, 183, 
238 
S-invariant, 127, 131, 183, 238 
skew, 178 
symmetric, 72 
types of, 33, 56, 71, 112, 176, 201, 
204, 226 
aberration coefficients, 31 
characteristic, 31, 226 
chromatic, 226 
computation of, 265 
dependence on object position, 114,
119, 120, 182
dependence on stop position, 112,
117, 122, 182, 238
effective, 73, 76, 101, 149, 161, 
164, 180, 231, 234 
normal, 175 
number of (see number) 
orders of (see order) 
skew, 175 
aberration functions, 19, 31, 50, 224 
chromatic, 225 
geometrical interpretation of, 103 
modified, 99, 102, 205 
monochromatic, 225 
normal, 175 
skew, 175 
absolute invariants, 127, 130, 167, 183, 
282 
achromatism, 232 
of aberration type, 232 
simple and multiple, 233 


adapted coordinates, 173 
adjoint generators, 128, 183 
afocal system, 13, 144, 165, 193, 195 
Airy disk, 58 
anamorphotic systems, 19 
anisotropy, 1, 171, 192, 222, 248 
examples of, 256 
angle characteristic, 13, 52, 215, 235 
aberration function, 52, 182 
computation of, 281, 287, 290, 307 
of concentric system, 158, 163 
dependence on position of base- 
planes, 116, 119, 121, 182, 238 
focal, 125 
geometrical interpretation of, 13 
ideal, 47, 50, 215, 218 
invariance of, 23 
of paraboloid, 271 
and regularity, 25 
of spherical surface, 14, 271 
of surface of revolution, 270 
angle of incidence, 4 
angle of refraction, 4 
anterior base-plane, 16 
aplanatism, 91 
aplanatic points, 310 
apochromatism, 233 
a-ray, 273 
A-rays, 249 
aspherical surfaces, 271 
astigmatism, 60, 66, 72, 131, 132, 159, 
198, 202 
skew, 178 
astigmatic aberration types, 72 
astigmatic curvature of field, 60, 66, 
131, 178, 198, 202 
astigmatic focal distance, 29 
asymmetric aberrations, 71 
asymmetry of the image, 61, 65, 68, 88 
axial rays, 83 
axial symmetry, 22, 35, 170 
axis of symmetry, 22, 213 


barrel distortion, 60, 153 
base-colour, 223 
base-planes, 16, 99 
base-ray, 26, 224 
base-surface, 98 

basis (coordinate-), 25, 173 
biaxial crystal, 256, 264 
b-ray, 273
B-rays, 249 


Cartesian ovoid, 310 
caustic surface, 57
central plane, 137, 156 
centre (of concentric system), 156 
characteristic aberration coefficients, 
31, 226 
computation of, 281, 287, 290, 307 
relation to effective coefficients, 74, 
78, 101, 234 
characteristic (functions), 11 
angle, 13 
cylindrical point, 211 
general mixed, 17 
geometrical significance of, 11, 13, 
15 
ideal (see ideal characteristic functions)
invariance of, 23 
mixed, 14, 15, 21, 166 
point, 11, 253 
and regularity, 25 
spherical point, 100, 242 
chromatic aberrations, 224 
classification of, 226 
paraxial, 231 
primary fifth-order, 235 
third-order, 234 
chromatic coordinate, 224, 227, 229 
chromatic difference of magnification, 
231 
chromatic order, 224 
circle polynomials, 111 
circular coma, 58, 63, 65, 68, 83, 84, 
141, 159, 164, 176, 178, 197, 211, 
226, 240 
and isoplanatism, 91 
normal, 176, 185 
and sine-condition, 88, 240 
skew, 176, 185 
classification of aberrations (see aberrations)
colour, 223 
coma, 58, 72, 164, 204 
circular (see circular coma) 
cubic, 72 
elliptical, 65, 69, 133, 179, 197 
linear (see circular coma), 72, 197, 
204, 211 
normal, 177, 185 
sagittal, 59, 63, 84, 162 
skew, 177, 185 
tangential, 59, 63, 84 
comatic aberrations, 72, 133, 146
comatic asymmetries, 61 
comatic ratio, 59, 63, 68, 81, 92, 135 
complete reversibility, 137 
computational methods, 265 
computation of aberrations, 265, 287, 
292 
of fifth-order angle characteristic, 
309 
of fifth-order coefficients of g*, 300 
of third-order angle characteristic, 
281, 290 
of third-order coefficients of gf, 299 
of third-order effective coefficients, 
279 
concentric meridional section, 221 
concentric systems, 156, 192, 209, 241 
conditional invariants, 133 
conditions of integrability, 18, 28, 76, 
306 
congruence of rays, 2, 7 
conjugate variables, 18, 31 
conjugate surfaces and points, 41, 44 
convergence (of series), 25, 213 
coordinate basis, 25, 30 
adapted, 173 
continuous symmetry group, 24 
cosine-conditions, 94, 184 
modified, 185 
offences against, 186, 207 
cosine-relations, 94, 207 
crystal optics, 252, 256, 263, 264 
c-symmetry, 191 
curvature of field, 59, 66, 60, 82, 159,
198, 202, 204
astigmatic, 60, 131 
Petzval, 60, 66, 131, 132 
curved image surface, 19, 82, 150, 155, 
165 
curved object surface, 153, 165 


D—d method, 242 
deformation of the wavefront, 106 
dielectric tensor, 258, 262 
differential equations for ray, 7, 262 
differential equations for V, 11, 254, 257
diffraction focus, 109 
diffraction, neglect of, 1, 259 
disk of least confusion, 57, 68 
displacement (see also aberrations), 19, 
32, 51, 77, 80, 84, 100, 161, 175, 
189, 196, 201, 204, 216, 225, 278 
out-of-focus, 84 
partial, 56 
pseudo-, 85 
distance, optical (see optical distance)
distortion, 60, 66, 70, 81, 82, 141, 149, 
159, 174, 178, 199, 202, 204 
barrel, 60, 153 
functions, 174 
pin-cushion, 60 
double achromatism, 233 
double plane-symmetry, 199 
duality, 292 


effective aberration coefficients, 73, 76, 
101, 149, 161, 164, 231, 234 
electron-optical systems, 170, 229, 

249, 257 
elementary reflection invariants, 191 
elementary rotational invariants, 36 
elliptical coma, 65, 69, 133, 179

skew, 183 

entrance pupil, 42 
exit pupil, 42 
extremum, 6 


Fermat’s Principle, 2, 6, 8, 243, 240, 262
field curvature (see curvature of field) 
final ray, 10, 174 

focal angle characteristic, 125 

focal curves, 34, 80 

focal lengths, 40, 193, 215, 225, 276 
focal lines, 29, 34, 50, 178 

focal points, 40 


Gaussian optics, 39 

geometrical optics, 1, 259 

generators of invariants, 128, 133, 183
adjoint, 128 


Hamiltonian optics, 2, 266 

and Maxwell’s equations, 258 
Hartmann’s formula, 228 
Hockin’s (Herschel’s) condition, 120 


ideal characteristic functions, 19, 20, 
46, 49, 50, 100, 174, 194, 200, 
203, 215, 218, 224 

ideal image height, 41, 196 

ideal image plane, 20, 41, 225 

ideal image point, 41, 225 

ideal imagery, 194 

ideal wavefront, 103 

identities 

between effective aberration coefficients,
76, 102, 306

parabasal, 27 

paraxial, 40 

between pseudo-coefficients, 301 
image plane, 19, 41, 172
out-of-focus, 54 
image space, 10
image surface, curved, 19, 82, 150, 
155, 165 
initial ray, 10, 174 
increments, 285 
modified, 295 
integrability conditions, 18, 28, 76, 
306 
intermediate variables, 278, 282 
as functions of co-ordinates, 284 
internal anisotropy, 1, 171, 192, 222
invariance under symmetry operation, 
23, 37, 137, 157, 170, 190, 213 
invariant aberrations, 118, 127, 130, 
167, 183, 282 
conditionally, 132 
number of, 129 
semi-, 127 
skew, 183 
invariants (see also invariant aberrations)
Lagrange, 29, 34 
optical, 24, 29, 191, 300 
quasi-, 275
reflection-, 191, 200 
rotational, 36, 38, 48 
isoplanatism, 91
isotropy, 1, 248 
iteration, 292 
for the fifth-order coefficients of g;, 
393 
for the fifth-order coefficients of ty, 
Lagrange invariants, 29, 34
Lagrangian optics, 266 

laws of refraction, 3, 5, 251, 255 
length, optical, 4 

light ray, 1, 255, 260 

line image, 204 

linear coma (see circular coma) 
longitudinal chromatic aberration, 231 


magnification, 19, 30, 41, 225 
chromatic difference of, 231 

Maxwell’s equations, 258 

mean asymmetry of the image, 61, 65 

mean focal length, 40, 225

meridional plane, 35, 173, 191 

meridional rays, 40, 56, 81, 171, 173 

merit function, 109 

m-invariant (aberrations), 127, 131,
183
mixed characteristic (functions), 15,
17, 166, 209 
geometrical interpretation of, 15 
ideal, 49 
of plane surface, 21 
modified aberration function, 99, 102, 
205 
modified increments, 295 
modified power, 41 
multiple achromatism, 233 


nodal planes, 43 

normal aberration function, 175 

normal aberration types, 176 

normal congruence, 2, 7 

normal plane, 39 

normal to wavefront, 1 

number of aberration coefficients, 31, 
32, 51, 54, 145, 158, 175, 188, 195, 
201, 203, 209, 210, 216 


object plane, 19, 41 
object shift, 114, 119, 120, 182 
object space, 10 
object surface, 153, 165 
oblique spherical aberration, 63, 69, 118 
offence against the sine-condition, 88, 
184, 239, 247 
offences against the cosine-conditions, 
186, 207 
optical distance, 5, 7, 8, 223, 249 
quasi-reduced, 229 
reduced, 16 
optical invariant, 24, 29, 191, 300 
optical length, 4 
optical (ray-) axes, 256 
optical system, 8 
afocal, 13, 44 
anamorphotic, 19 
anisotropic, 1, 248 
axially symmetric, 22, 35, 170 
c-semi-symmetric, 192 
c-symmetric, 191
completely reversible, 136 
concentric, 156, 192, 209, 241 
doubly plane-symmetric, 28, 32, 199 
internally anisotropic, 1, 171, 192,
222 
plane-symmetric, 22, 28, 203 
reversible, 136, 186, 192, 208, 218, 
240 
(r-) semi-symmetric, 170
r-symmetric, 36 
symmetric, 35 
telescopic, 13, 44, 165 
toroidally semi-symmetric, 212
toroidally symmetric, 212 
translationally semi-symmetric, 190 
translationally symmetric, 191 
t-symmetric, 212 
order (of aberrations, coefficients) 
chromatic, 224 
monochromatic, 31, 33, 50, 112 
order of remainder terms, 26, 227 
orthogonal trajectories, 2, 7 
out-of-focus displacement, 84 


parabasal coefficients, 26 
parabasal identities, 27 
parabasal optics, 26 
paraxial calculations, 272 
paraxial chromatic defects, 230, 237 
paraxial constants, 39, 273 
paraxial optics, 26, 39, 172, 193, 273 
partial displacement, 56 
perfect imagery (see also ideal imagery),
18 
perfect planes, 120 
performance number, 109 
Petzval curvature, 60, 66, 131, 132, 
198, 202, 282 
Petzval sum, 282 
Petzval surface, 60, 198, 202 
pin-cushion distortion, 60 
plane-symmetric system, 22, 28, 203 
plane refracting surface, 20, 21, 269 
point characteristic, 8, 11, 98, 249 
aberration function, 52 
and anisotropic media, 249 
computational problem, 269 
cylindrical, 211 
dependence on positions of base- 
planes, 95, 114 
differential equations for, 11, 254, 
257 
geometrical interpretation of, 11 
ideal, 19, 20, 46, 49, 194, 203, 224 
invariance of, 23 
and Maxwell’s equations, 258 
of plane surface, 20 
and regularity, 25 
spherical, 100, 242 
point of incidence, 3 
polarization, 262 
posterior base-plane, 16 
posterior base-surface, 98 
power, 41 
power series, 25, 31, 50, 213, 224 
primary, meaning of, 50, 225 
principal axis transformation, 263
Principal Function, 266
principal planes, 43 
principal ray, 59, 149 
pseudo-coefficients, 297 
pseudo-displacement, 85 
pseudo-expansion, 295 
pupils (pupil planes), 42 

shift of, 112, 117, 122, 126, 182 


quasi-invariants, 275 
decomposition of, 276 
quasi-reduced distance, 229 


ray, 1, 2, 255, 260
axial, 83 
base-, 26, 224 
-coordinates, 17 
differential equations for, 7, 262 
final, 10, 174 
-index, 255 
initial, 10, 174 
meridional, 40, 56, 173 
principal, 59, 149
sagittal, 59 
tangential, 59 
reduced distances (see optical distance) 
reflection invariants, 191, 200 
refraction, laws of, 3, 5, 251, 255 
refractive index, 3 
of anisotropic medium, 249, 251, 
256, 257, 263 
normal, 257 
skew, 257 
regularity, 25, 30, 214 
remainder terms, 26, 227 
retardation of the wavefront, 106, 108 
relation to aberration functions, 107 
reversible systems, 136, 186, 192, 208, 
218, 240 
r-semi-symmetry, 170 
r-symmetry, 36, 213 
rotational invariants, 36, 38, 48 
elementary, 36 


sagittal 
asymmetry, 61, 65, 68, 88, 162 
coma, 59, 63, 68, 84, 162, 185 
focal line, 59 
image surface, 60 
plane, 35 
rays, 59 
secondary, meaning of, 50, 225 
secondary spectrum, 232 
second-order aberrations, 34, 50, 203 
Seidel coefficients, 55, 112, 140, 176, 280
semi-invariant (aberrations), 127, 131,
133, 183, 238 
semi-symmetric systems, 170 
sharp imagery, realizability of, 29, 141, 
150, 153, 165, 188, 194, 217, 218, 
220 
shift of object or stop, 112, 114, 117, 
119, 122, 182, 238 
sign conventions, xiv 
simple achromatism, 233 
sine-condition, 85, 93, 162, 184 
and chromatic defects, 239 
modified, 86 
offence against the, 88, 239, 247 
and sagittal circular coma, 88 
and tangential circular coma, 90 
sine-relation, 84, 184, 207 
s-invariant (aberrations), 118, 127, 131, 
183 
skew aberration function, 175 
skew aberration types, 176 
skew congruence, 2 
Snell’s law, 4, 255 
spherical aberration, 56, 63, 67, 83, 
84, 159, 162, 176, 179, 197, 201, 
226 
and chromatism, 233, 234 
oblique, 63, 69, 118 
spherical point characteristic, 100, 242 
sphero-chromatism, 232 
stop, 41, 196, 199 
shift of, 112, 117, 122, 182, 238 
Strehl intensity, 109 
summation convention, 258 
superachromatism, 234 
superapochromatism, 234 
symmetric aberrations, 72 
symmetric system, definition of, 35 
symmetries of system, meaning of, 22 
continuous, 24 


tangential asymmetry, 61 

tangential circular coma, 60, 65, 68, 
84, 90 

tangential focal line, 59 

tangential image surface, 60 

tangential plane, 35 

tangential rays, 59 

telescopic system, 13, 44, 165 

third-order displacement, 55, 176, 197, 
201, 216, 234, 279, 281 

toroid, 212, 220 

toroidal semi-symmetry, 212 

toroidal symmetry, 212 

torus, 212 
transverse chromatic aberration, 231

t-symmetry, 212

types of aberrations (see aberrations)


uniaxial crystal, 252, 264


variational principle, 6
vignetting, 41


wavefront, 1, 103, 259
deformation of the, 106
ideal, 103
retardation of the, 106
wavefront aberrations, 112
wave-index, 255
wavelength, 1, 223, 228, 259
wave-normal, 255
wave-surface, 2


Zernike polynomials, 111
zonal curves, 56, 196, 199
zonal rays, 56